Journal of Applied Psychology


Edited by
Donald G. Paterson
University of Minnesota


Consulting Editors


George K. Bennett, Psychological Corporation
Walter V. Bingham, Washington, D. C.
Harold E. Burtt, Ohio State University
Allen L. Edwards, University of Washington
Clifford E. Jurgensen, Minneapolis Gas Co.
Irving Lorge, T. C. Columbia University
Quinn McNemar, Stanford University
Alexander Mintz, City College of New York
James P. Porter, Danville, Illinois
Julian B. Rotter, Ohio State University
Edward K. Strong, Jr., Stanford University
Donald E. Super, T. C. Columbia University
Morris S. Viteles, University of Pennsylvania
Alfred C. Welch, Knox-Reeves, Minneapolis


Volume 34, 1950 


Published Bi-monthly by the American Psychological Association, Inc. 
Prince and Lemon Sts., Lancaster, Pa. 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the Act of March 3, 1879 


Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), Section 34.40,
... authorized October 10, 1947


Copyright, 1950, by The American Psychological Association, Inc. 


Contents of Volume 34 


Articles 
Ammons, R. B., Butler, M. N. and Herzig, S. A. A Projective Test for Vocational Research and Guidance at the College Level.
Baas, M. L. Kuder Interest Patterns of Psychologists.
Barnette, W. L., Jr. The Non-Respondent Problem in Questionnaire Research.
Barnette, W. L., Jr. Reactions of Veterans to Counseling.
Berdie, R. F. Scores on the Strong Vocational Interest Blank and the Kuder Preference Record in Relation to Self Ratings.
Bittner, R. H. and Rundquist, E. A. The Rank-Comparison Rating Method.
Brayfield, A. H. and Reed, P. A. How Readable are Occupational Information ...
Browne, C. G. Study of Executive Leadership in Business. II. Social Group ...
Browne, C. G. Study of Executive Leadership in Business. III. Goal and Achievement ...
Browne, C. G. The Concentric Organization Chart.
Burke, L. K. and Taylor, E. K. Rating Training and Experience.
Campbell, D. T. and Mohr, P. J. The Effect of Ordinal Position upon Responses to ...
... Validity of an Objectivity Key on a Short Industrial ...
... Technique for Investigating Certain Reading Abilities of College Students.
Corso, J. F. and Lewis, D. Preferred Rate and Extent of the Frequency Vibrato.
Cottle, W. C. Card Versus Booklet Forms of the MMPI.
Cover, C. B. and Pressey, S. L. Age and Route Sales Efficiency.
... System.
de Cillis, O. E. and Orbison, W. D. A Comparison of the Terman-Miles M-F Test and the Mf Scale of the MMPI.
DuBois, P. H. and Watson, R. I. The Selection of Patrolmen.
Dunlap, J. W. The Effect of Color in Direct Mail Advertising.
Edwards, A. S. The Myth of Chronological Age.
... Norms for College Groups.
England, A. O. Getting Your Message Across by Plain Talk.
Farr, J. N. Readability and Interest Values in an Employee Handbook.
Flesch, R. Measuring the Level of Abstraction.
Ford, A., Rigler, D. and Dugan, G. E. Point Centering of Signals on an Area.
... and Dennegar, L. S. Attitudes of Veterans toward Vocational Guidance Services.

Gilbert, C. The Guilford-Zimmerman Temperament Survey and Certain Related Personality Tests.
Golin, E. and Lyerly, S. B. The Galvanic Skin Response as a Test of Advertising ...
Goodman, C. H. The MacQuarrie Test for Mechanical Ability. IV. Time and Motion Analysis.
Gray, J. S. Custom Made Systems of Job Evaluation.
Gray, J. S. and Prevetta, P. Fluorescent Light Versus Daylight.
Griffith, J. W., Kerr, W. A. and Mayo, T. B., Jr. Changes in Subjective Fatigue and Readiness for Work During the Eight-Hour Shift.
Hay, E. N. Cross-Validation of Clerical Aptitude Tests.
Hay, E. N. The Application of Weber's Law to Job Evaluation Estimates.
Hayes, P. M., Jenkins, J. J. and Walker, B. J. Reliability of the Flesch Readability Formulas.
Jenkins, W. L., Maas, L. O. and Rigler, D. Influence of Friction in Making Settings ...
Johnson, R. H. and Bond, G. L. Reading Ease of Commonly Used Tests.
Jones, F. N. and Smith, C. J. Visual Skill and Performance in a Meat Packing Plant.
Jones, M. H. The Adequacy of Employee Selection Reports.
Jurgensen, C. E. Intercorrelations in Merit Rating Traits.
Jurgensen, C. E. Overall Job Success as a Basis for Employee Ratings.
Kates, S. L. Rorschach Responses, Strong Blank Scales, and Job Satisfaction Among Policemen.
Keating, E., Paterson, D. G. and Stone, C. H. Validity of Work Histories Obtained by Interview.
Kephart, N. C. and Besnard, C. G. Visual Differentiation of Moving Objects.
Kephart, N. C. and Mason, J. M. Acuity Differences between the Two Eyes and Job Performance.
Kerr, W. A. Accident Proneness of Factory Departments.
Klugman, S. F. Spread of Vocational Interests and General Adjustment Status.
Layton, W. L. An IBM Card Profile for the Strong Vocational Interest Blank.
Levine, A. S. Construction and Use of Verbal Analogy Items.
Levine, A. S. Minnesota Psycho-Analogies Test.
Link, H. C. and Freiberg, A. D. To What Extent Have the American People Accepted Socialism?
Lord, F., Cowles, J. T. and Cynamon, M. The Pre-Engineering Inventory as a Predictor of Success in Engineering Colleges.
Mandell, M. M. The Administrative Judgment Test.
Miller, H. K., Jr. An Exploratory Study of Linear Interpolation.
Mitchell, M. B. and Rothe, H. F. Validity of an Emotional Key on a Short Industrial Personality Questionnaire.
... Performance by Means of a Printed Test.
Owens, W. A. An Aptitude Test for Veterinary Medicine.
Page, ...
... Industrial Mathematics Test and the ...
Pashalian, S. and Crissy, W. J. E. How Readable Are Corporate Annual Reports?


Paterson, D. G. Report on the Journal of Applied Psychology for 1949.
Peatman, J. G. and Hallonquist, T. Geographical Sampling in Testing the Appeal of Radio Broadcasts.
Poruben, A., Jr. A Test Battery for Actuarial Clerks.
Pronko, N. H. and Herman, D. T. Identification of Cola Beverages. IV. Postscript.
Prothro, E. T. and Perry, H. T. Group Differences in Performance on the Meier ...
Ramond, C. K., Rachal, L. H. and Marks, M. R. Brand Discrimination among Cigarette Smokers.
Romney, A. K. The Kuder Literary Scale as Related to Achievement in College ...
Ross, S. A Study of Shooting Glasses by Means of Firing Accuracy.
Rothe, H. F. Use of an Objectivity Key on a Short Industrial Personality Questionnaire.
Smith, A. J. Menstruation and Industrial Efficiency. I. Absenteeism and Activity Level.
Smith, A. J. Menstruation and Industrial Efficiency. II. Quality and Quantity of Production.
Stern, B. Upper versus Lower Case Copy as a Factor in Typesetting Speed for Linotype Trainees.
Travers, R. M. W. and Wallace, W. L. Inconsistency in the Predictive Value of a Battery of Tests.
Wagner, R. F. Critical Requirements for Dentists.
Wallace, S. R., Jr. and Whitney, A. G. The Prediction of Persistency in Premium ...
Weitz, J. Verbal and Pictorial Questionnaires in Market Research.
Wesley, S. M., Corey, D. Q. and Stewart, B. M. The Intra-Individual Relationship between Interest and Ability.
Wiener, D. N. and Simon, W. Personality Characteristics of Embalmer Trainees.
Williams, A. C., Jr. and Roscoe, S. N. Evaluation of Aircraft Instrument Displays for Use with the Omni-Directional Radio Range (VOR).
Woods, W. A. and Boudreau, J. C. Design Complexity as a Determiner of Visual Attention Among Artists and Non-Artists.


Book Reviews 


Allport's The Individual and His Religion: Paul E. Meehl.
Anastasi and Foley's Differential Psychology: Gladys C. Schwesinger.
Bellows' Psychology of Personnel in Business and Industry: Albert S. Thompson.
Bellows and Rush's Workbook in Personnel Methods: Albert S. Thompson.
Bennett and Cruikshank's A Summary of Clerical Tests: Donald E. Super.
Boynton's Selecting the New Employee: Robert N. ...
Brouwer's Student Personnel Services in General Education: Milton E. Hahn.
Calhoon's Problems in Personnel Administration: Albert S. Thompson.
Cavan, Burgess, Havighurst, and Goldhamer's Personal Adjustment in Old Age: Albert R. Chandler.
Chapanis, Garner, and Morgan's Applied Experimental Psychology: Human Factors in Engineering Design: Jack W. Dunlap.
Drake ... Casebook for Executives and Supervisors: Albert S. Thompson.
Flesch's ... Writing: James J. Jenkins.
... for the Profession of Nursing: Helen Nahm.
Hoppock's Group Guidance; Principles, Techniques, and Evaluation: Milton E. Hahn.
Hovland, Lumsdaine, and Sheffield's Experiments on Mass Communication. Studies in Social Psychology in World War II, Vol. III: Allen L. Edwards.
Libo's Attitude Prediction in Labor Relations: A Test of Understanding: Edwin E. ...
Massing's Rehearsal for Destruction: Harrison G. Gough.
Mathewson's Guidance Policy and Practice: Donald E. Super.
McCord and Witheridge's Odors: Physiology and Control: Josef Brozek.
Mossin's Selling Performance and Contentment in Relation to School Background: Daniel ...
Pease's Machine Computation of Elementary Statistics: Kenneth E. Clark.
Prentice-Hall, Inc. The New Cure for White Collar Unrest: Clifford E. Jurgensen.
Pressey's Educational Acceleration: Appraisals and Basic Problems: John W. Gustad.
Reynolds and Shister's Job Horizons: C. E. Jurgensen.
Schramm's Mass Communications: Alfred C. Welch.
Stuit, Dickson, Jordan, and Schloerb's Predicting Success in Professional Schools: Barbara A. Kirk.
Super's Appraising Vocational Fitness: J. R. Wittenborn.
Tinkelman's Difficulty Prediction of Test Items: Charles I. Mosier.
... Out-of-School Vocational Guidance: Fred M. Fowler.
Warner, Gardner, Henry, and Haggard's Identifying and Developing Potential Leaders: Ralph R. Canter, Jr.
Weitzman and McNamara's Constructing Classroom Examinations: A Guide for Teachers: Walter W. Cook.
Weston's Sight, Light and Efficiency: Miles A. Tinker.
Williamson's Trends in Student Personnel Work: Albert S. Thompson.


Miscellaneous 
New Books, Monographs, and Pamphlets 




Journal of Applied Psychology 


Vol. 34, No. 1


FEBRUARY, 1950 


Menstruation and Industrial Efficiency. I. Absenteeism 
and Activity Level * 


Anthony J. Smith 


University of Kansas 


A survey of the experimental work directly 
related to the influence of menstruation upon 
industrial efficiency reveals that investigators 
have been primarily concerned with absentee- 
ism, with much less emphasis placed upon pro- 
duction. Studies indirectly related to the prob- 
lem have dealt with such things as subjective 
changes and performances of various types,
such as learning, steadiness, satiation, etc.¹

Many of these reports are to be criticized on
several grounds. Some of the studies yielded
generalizations based on pitifully small numbers
of individuals. Social status and age
were often disregarded. The suitability of the
subjects, with reference to physical condition,
was often given less consideration than the
availability of the subjects. Analyses often
contrasted menstrual with nonmenstrual data,
rather than comparing the finer components of
the menstrual cycle. Fluctuations occurring
during menstruation were sometimes overemphasized,
whereas other equally great fluctuations
were ignored. The factor of suggestion
was occasionally uncontrolled. Preconceived
personal opinions at times seem to have been
more powerful in determining conclusions than
the actual data obtained. Statistical treatment
of the data was inadequate in some
studies, almost completely missing in others,
with the result that the basis for inference was
absent or uncomfortably shaky.

In spite of these criticisms, some facts seem
to stand out. Menstrual absences, despite
occasional conflicting reports, would appear to
be, in general, rather low and of much less importance
to an employer than many other
factors contributing to absenteeism. The
numerous subjective changes, however (depression,
increased emotionality, anxiety, fatigue,
irritability, cramps, headache, etc.),
sometimes classified as “pre-menstrual tension,”
are reported by subjects with great frequency.
It would not be illogical to hypothesize
that these changes would be reflected in
the efficiency of a woman worker.

* This material is derived from the author's dissertation
submitted in April, 1945 in partial satisfaction of
the requirements for the Ph.D. degree at the University
of California, Los Angeles. The author is deeply
indebted to Dr. Roy M. Dorcus for his guidance and
criticism.

¹ For an excellent review of this material see the
study by Georgene H. Seward (6).


Procedure and Analytic Techniques 


In the present investigation, workers in the
electrical department of an aircraft factory
and in two separate garment companies were
studied. The criteria of industrial efficiency
were absence rate, activity level, quality of
production and quantity of production. The
first two will be discussed in this paper.


Aircraft Factory. Those women who served as sub- 
jects in the aircraft factory were working at various 
tasks such as assembling electrical equipment, soldering, 
and constructing wiring circuits. They were paid a 
straight wage with provisions for higher overtime rates. 
The full-time secretary of the group kept accurate 
records of all leaves of absence, transfers, absences and 
tardinesses, and these records were made available for 
analysis. In addition to the use of absence rate as a 
criterion of efficiency, it was decided to examine a 
characteristic of behavior often commented on but little 
studied, i.e., activity level. It has been widely assumed 
that a woman in her menstrual period exhibits a lower- 
ing in general activity, an idea probably stemming from 
an unsubstantiated belief that a considerable number of
women are forced to retire from general activity during
this “sick” period. The lowering in activity level is
presumably of such dimensions as to be readily observable.
If this were true, such a change should be apparent
to someone who has been specifically instructed to
look for it. It was evident that the leadmen in the 
group were in most intimate contact with the women 
and would be best qualified to make judgments con- 
cerning their activity levels. Consequently, they were 
required to turn in daily ratings on a form requesting 
the checking at the end of the day of one of the follow- 
ing: (a) Very energetic and industrious; (b) Fairly 
industrious, quite active; (c) Works in a slow, leisurely 
manner; and (d) Relatively inactive, works very slowly. 
The leadmen, of course, were not acquainted with the 
purpose of the study. 

Menstrual data were collected through the coopera- 
tion of the personnel and physical education depart- 
ments. It was absolutely necessary that none of the 
women participating recognize the true nature of the 
experiment for the results might then more closely 
reflect the anticipated effects of menstruation than its 
actual effects. To accomplish this, the experiment was 
described as being concerned only with the duration of 
the menstrual cycle and of the menstrual phase of the 
cycle. 

A group of thirty-eight women between the ages of
17 and 44 had been selected on the basis of age (an
arbitrary limit of 45 being set) and the availability of
daily inspection records. These women were then
assembled on company time and the prepared statement
of the “purpose” was read by a member of the
personnel staff. It was emphasized that participation
would be deeply appreciated but that no one should fail
to assert herself should she feel disinclined to supply the
information. All offered to cooperate. They were
then informed that a daily contact would be made by
a member of the physical education department to
ascertain the phase of the menstrual cycle. The women
were only required to inform the person whether or not
they were menstruating.

Information on absences, ratings of activity level, and
menstruation was gathered for a period of forty-one
days, during August and September, 1943. By the end
of this period the number of women about whom sufficient
information had been gathered to permit an
analysis had been reduced to twenty-nine.

For purposes of analysis, the menstrual cycle was
broken down in two ways. First, a simple menstrual
(period of flow)-nonmenstrual division was made.
Second, the cycle was broken down into a five day
premenstrual period, a menstrual (bleeding) period, a
seven day postmenstrual period, and an intermenstrual
period.

The effect of the menstrual cycle upon absence rate
was examined by determining the number of absences
and number of days in attendance for all women for
each of the four phases of the cycle. These data were
then recorded in a two-by-four table and the significance
of the variations was tested by means of the
chi-square test. The nonmenstrual days were then combined
in one group and the menstrual-nonmenstrual
analysis was performed.

A group analysis of activity level was made by
analyzing the ratings of the leadmen, again using the
chi-square technique. Because of the low frequency
with which rank four was assigned, it was necessary to
combine ranks three and four.
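
The four-phase breakdown described above can be sketched in code. The following Python fragment is only an illustration, not the author's procedure: the function name `label_phases`, the constants `PRE_DAYS` and `POST_DAYS`, and the boundary rules (premenstrual taking precedence over postmenstrual when the two windows overlap, and days before the first observed window being counted as intermenstrual) are assumptions introduced here. Only the five-day and seven-day window lengths come from the text.

```python
from typing import List

PRE_DAYS = 5   # five-day premenstrual period (from the text)
POST_DAYS = 7  # seven-day postmenstrual period (from the text)


def label_phases(flow: List[bool]) -> List[str]:
    """Label each day of a daily record with one of the four phases.

    flow[i] is True when menstruation was reported on day i.
    Assumption (not from the source): if a day falls in both a
    postmenstrual and a premenstrual window, premenstrual wins.
    """
    n = len(flow)
    labels = ["intermenstrual"] * n

    # Postmenstrual: up to POST_DAYS non-flow days after a flow period ends.
    for i in range(n):
        if flow[i] and i + 1 < n and not flow[i + 1]:
            for j in range(i + 1, min(n, i + 1 + POST_DAYS)):
                if flow[j]:
                    break
                labels[j] = "postmenstrual"

    # Premenstrual: up to PRE_DAYS non-flow days before a flow period begins.
    for i in range(n):
        if flow[i] and i > 0 and not flow[i - 1]:
            for j in range(i - 1, max(-1, i - 1 - PRE_DAYS), -1):
                if flow[j]:
                    break
                labels[j] = "premenstrual"

    # Days of flow are always menstrual.
    for i in range(n):
        if flow[i]:
            labels[i] = "menstrual"
    return labels


# Hypothetical 41-day record: flow on days 10-14 and again on days 38-40.
example_flow = [False] * 41
for day in list(range(10, 15)) + [38, 39, 40]:
    example_flow[day] = True
example_labels = label_phases(example_flow)
```

Counting absences and days worked within each label would then give the cells of the two-by-four table; merging the three non-flow labels gives the simpler menstrual-nonmenstrual division.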

Parachute Factory. The second study was carried 
out in a local garment factory in which parachutes were 
being made. A survey of the plant revealed that it 
provided an ideal set-up for such an investigation as 
this. An overwhelming majority of the employees 
were women. There was a great diversity in the types
of work performed in the plant, permitting a wide
sampling of tasks, and yet, except on rare occasions, the 
women continued to work at the same jobs. Moreover, 
three shifts were available for study. Finally, a piece 
rate system of payment was in use, with the result that 
excellent records were kept by the management. 

Forty-six women were selected for investigation so 
that there would be the most equitable distribution of 
women and jobs with reference to shift, age of em- 
ployee, mental difficulty of work, physical difficulty of 
work, and the necessity for standing or sitting continu- 
ously on the job. These women were contacted and 
agreed to cooperate. 

In addition to the analysis of the entire group, sep- 
arate analyses were also made for each of the following 
thirteen groupings: three shifts, three age groups, three 
levels of mental difficulty, two levels of physical diffi- 
culty, and standing vs. sitting jobs. 

Garment Factory. The last part of this investigation 
was carried out in a garment factory and was made 
possible through the cooperation of the personnel de- 
partment and the local labor union. 

The union gathered data on daily earnings, pre- 
sumably to obtain information about salaries that 
would permit of a more enlightened discussion at union- 
management arbitrations. These data were recorded 
on mimeographed forms covering weekly periods, from 
which daily attendance records could then be derived. 
The workers were contacted by the author's wife and
information concerning menstrual cycles was requested
with the same apparent "purpose" being quoted. At
no time was any connection made between menstrual
records and earning records.

Absences were then analyzed in the manner previously
described.


Results


Activity Level. Results from the investigation
of activity level are presented in Table 1.
The analyses of these data indicate that leadmen
cannot detect any reliable change in rate
of working during the menstrual cycle. When
the four phases of the menstrual cycle are examined,
it is found that somewhat lower ratings
were given during the menstrual and premenstrual
phases. However, the probability value
corresponding to the obtained chi-square was
... When the menstrual and nonmenstrual
phases were contrasted, the probability
value was .40. It would seem justifiable to
conclude that if real differences in activity
level among the various phases of the menstrual
cycle actually exist, they would appear to be of
such small magnitude as to warrant little or no
consideration in this industrial situation.


Table 1


Activity Level: Frequency of Assignment of Various
Ratings During the Phases of the Menstrual Cycle


Rating   Premenstrual   Menstrual   Postmenstrual   Intermenstrual
1        50             56          73              130
         (56.92)*       (61.56)     (74.34)         (116.16)

2        66             70          95              121
         (64.83)        (70.12)     (84.67)         (132.30)

3-4      31             33          24              49
         (25.24)        (27.30)     (32.97)         (51.51)


* Values in parentheses refer to expected frequencies.
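
The expected frequencies shown in parentheses in Table 1 follow from the usual rule for a contingency table: row total times column total, divided by the grand total. A minimal Python sketch of that arithmetic is given below; the names `OBSERVED`, `EXPECTED` and `expected_frequencies` are introduced here for illustration and do not appear in the source.

```python
# Observed rating frequencies from Table 1. Columns are the premenstrual,
# menstrual, postmenstrual and intermenstrual phases, in that order.
OBSERVED = [
    [50, 56, 73, 130],  # rating 1
    [66, 70, 95, 121],  # rating 2
    [31, 33, 24, 49],   # ratings 3 and 4 combined
]


def expected_frequencies(table):
    """Return expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


EXPECTED = expected_frequencies(OBSERVED)

if __name__ == "__main__":
    for row in EXPECTED:
        print("  ".join(f"{value:7.2f}" for value in row))
```

Computed this way, the values agree with the printed table to within a few hundredths; the small discrepancies in the second row presumably reflect rounding in the original hand computation.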

Absences. The criterion of efficiency most 
adequately studied during this investigation 
was the absence rate of the worker. A total of 
91 women acted as subjects and absence records 
for approximately 3800 potential working days 
were examined (man-days). The data for the 
workers in each of the three factories are presented
in Table 2. It should be emphasized
that menstrual absences are merely those that
have occurred during the menstrual period and
could conceivably be attributed to dysmenorrhea,
colds, injury, personal business, indifference,
etc. If it had been possible to secure
adequate data on causes of absences, it seems 
likely that the results would have been in close 
agreement with Gafafer's work, in which it was 
discovered that absences attributable to dys- 
menorrhea were responsible for an average 
loss of only .29 days per person per year (2). 

Information gathered at the aircraft factory
revealed that menstrual absences occurred less
frequently than might have been expected on
the basis of chance, while premenstrual absences
were more frequent. However, this
trend was not significant (P>.30).

The more exhaustive study of the total
group at the parachute factory yielded low
menstrual absence rates and high postmenstrual
absence rates. In the sub-groups no
significant differences were found in the
menstrual-nonmenstrual analyses. However, in

the examination of the four component phases 
of the menstrual cycle, significance was en- 
countered in three instances. The women who 
worked on the second shift showed relatively 
high postmenstrual and low premenstrual ab- 
sence rates (P=.05). Women in the age range 
29-38 also manifested high postmenstrual
and low menstrual absence rates (P<.01).
Finally, the women in the age range 39-50 had 
high premenstrual and low menstrual absence 
rates (P=.02). 

In the first two instances (second shift, ages
29-38) excessive absences do not occur during
those parts of the menstrual cycle that have
been presumed to be most often characterized
by the presence of unpleasant symptoms or
disabling effects. The reasons for the appearance
of high postmenstrual absence rate in
these two cases are unknown but it is possible
to speculate that it might be a function of the
following factors: a tendency toward decreased
social activity during the menstrual period, a
decrease in the frequency of sexual relations
during the period of flow (4), an increase in the
frequency of intercourse during the postmenstrual
period (4), an increase in the strength
of the sexual drive during the postmenstrual
period (1, 3), and a greater proportion of men
working on the first and third shifts combined
than on the second shift.


Table 2


Attendance Records for Total Groups


                    Premenstrual   Menstrual   Postmenstrual   Intermenstrual
Aircraft Factory
  Absent            11             5           12              14
                    (7.89)*        (8.13)      (10.14)         (15.83)
  Working           150            161         195             309
                    (153.11)       (157.87)    (196.86)        (307.17)
Parachute Factory
  Absent            21             24          40              66
                    (22.23)        (26.18)     (32.30)         (70.39)
  Working           328            387         467             1039
                    (326.77)       (384.82)    (474.70)        (1034.61)
Garment Factory
  Absent            1              11          8               11
                    (5.29)         (6.43)      (5.80)          (13.55)
  Working           83             91          84              204
                    (78.71)        (95.57)     (86.20)         (201.45)


* Values in parentheses refer to expected frequencies.
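
The chi-square test applied to each two-by-four block of Table 2 can be illustrated as follows. This is a sketch under stated assumptions, not a reproduction of the author's computation: the names `TABLE_2`, `chi_square` and `RESULTS` are hypothetical, and the comparison value 7.815 is the standard tabled chi-square for three degrees of freedom at the 5 percent level.

```python
CRITICAL_05_DF3 = 7.815  # tabled chi-square, 3 degrees of freedom, P = .05

# Observed counts from Table 2 as (absent, working); columns are the
# premenstrual, menstrual, postmenstrual and intermenstrual phases.
TABLE_2 = {
    "aircraft": ([11, 5, 12, 14], [150, 161, 195, 309]),
    "parachute": ([21, 24, 40, 66], [328, 387, 467, 1039]),
    "garment": ([1, 11, 8, 11], [83, 91, 84, 204]),
}


def chi_square(table):
    """Pearson chi-square statistic for a rows-by-columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


RESULTS = {
    name: chi_square([absent, working])
    for name, (absent, working) in TABLE_2.items()
}

if __name__ == "__main__":
    for name, statistic in RESULTS.items():
        verdict = "exceeds" if statistic > CRITICAL_05_DF3 else "is below"
        print(f"{name}: chi-square = {statistic:.2f}, {verdict} {CRITICAL_05_DF3}")
```

With (2 - 1) x (4 - 1) = 3 degrees of freedom per factory, only the garment factory statistic exceeds the 5 percent value, which agrees with the significance statements made in the text for the total groups.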

For the final sub-group yielding significance 
(ages 39-50) it might seem at first glance that 
earlier conceptions are substantiated. How- 
ever, if the intermenstruum is accepted as a 
"normal" basis for comparison, this is not the 
case. The low menstrual absence rate offsets 
the high premenstrual rate. While the earlier 
conception is contraindicated, it would seem to 
be advisable to examine the high premenstrual 
rate for this older group in greater detail. 

At the garment factory results were en- 
countered that were significant at the 5 percent 
level. Here menstrual absences were high, 
premenstrual absences low.

There are several differences that exist between
the garment factory and the parachute
and aircraft factories that might be responsible
for the obtained differences in absence rates.

In the war industries there was the possibility
that patriotic motives might play a part
in determining both group attitudes toward
absenteeism and the likelihood that the individual
would consider that slight discomforts
would justify staying home.

The fact that the loss of a day would eliminate
overtime payments for that week might
... is nonprogressive and of
relatively short duration. In the garment
factory a straight piece rate system was employed
so absence caused no disproportionate ...

Wages in general were appreciably higher
in the war plants so that many of the better
workers had left the garment factory, and those
that remained might have been less conscientious
concerning attendance, particularly during
a period of labor shortage.

Differences ... backgrounds ... The author does
... available to ... workers come from a ...
or that they are ... menstruation has
disabling effects. But in the light of the work
... which it was shown ... effects of menstruation
tended to be greater among subjects from the
lower socio-economic group, it would seem to
be worthwhile to carry out further, more detailed
investigations.

Finally, facilities for overcoming dysmen- 
orrhea were more adequate in the parachute 
and airplane factories. The parachute factory 
had a nurse in attendance who treated the em- 
ployees and gave suggestions for the relief of 
any discomforts. Employees could come into 
the dispensary to rest and obtain help whenever 
necessary. In the aircraft factory this same 
service was available. Furthermore, groups of 
women met and were given special instructions 
in matters of hygiene and exercises that would 
be beneficial in relieving fatigue, menstrual 
discomforts, etc. This process might perhaps 
work both through direct physiological changes, 
and through suggestion. 

It should also be recalled that should no
real differences exist in absence rate, approximately
five samples out of each hundred will
yield significant differences. Whether or not
such an explanation might be defensible here,
it is, of course, impossible to say.
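
The point about chance significance can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that the fourteen four-phase analyses at the parachute factory (the total group plus the thirteen groupings) were statistically independent, which they are not, since the sub-groups share the same women; the figures are therefore only a rough guide. The function names are hypothetical.

```python
ALPHA = 0.05      # the 5 percent significance level
N_ANALYSES = 14   # assumption: total group plus thirteen sub-group analyses


def prob_at_least_one_significant(alpha, n_tests):
    """Chance of one or more 'significant' results when no real effect exists,
    treating the tests as independent (an assumption made for illustration)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(alpha, n_tests):
    """Average number of chance-significant results among n_tests tests."""
    return alpha * n_tests


if __name__ == "__main__":
    p = prob_at_least_one_significant(ALPHA, N_ANALYSES)
    k = expected_false_positives(ALPHA, N_ANALYSES)
    print(f"P(at least one significant by chance) = {p:.3f}")
    print(f"Expected number significant by chance = {k:.2f}")
```

Under these assumptions roughly one analysis in fourteen would be expected to reach the 5 percent level by chance alone, and the chance of at least one doing so is about even, so isolated significant sub-group results call for cautious interpretation.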


Summary 


This phase of the investigation was con- 
cerned with the relationship between the com- 
ponents of the menstrual cycle and two meas- 
ures of industrial efficiency: activity level and 
absence rate. The subjects were 96 women in 
three separate factories. 

The data indicate that: 


1. There is no discernible change in activity
level related to menstrual function.

2. Significant differences in absence rate
among the phases of the menstrual cycle are
encountered in three subgroups at the parachute
factory and at the garment factory. High
rates of absence are not characteristically found
in any one phase of the cycle.


Received April 19, 1949.


References 


1. Davis, Katherine B. Factors in the sex life of twenty-two
   hundred women. New York: Harpers, 1929.
2. Gafafer, W. M. (Ed.) Manual of industrial hygiene
   and medical service in war industries. Philadelphia:
   W. B. Saunders Co., 1943.
3. Howell, W. H. Textbook of physiology. (14th Ed.)
   Philadelphia: W. B. Saunders Co., 1941.
4. McCance, R. A., Luff, M. C., and Widdowson, E. E.
   Physical and emotional periodicity in women.
   J. Hyg., Camb., 1937, 37, 571-611.
5. Rider, P. R. An introduction to modern statistical
   methods. New York: John Wiley, 1939.
6. Seward, Georgene H. Psychological effects of the
   menstrual cycle on women workers. Psychol.
   Bull., 1944, 41, 90-102.
7. Sowton, S. C. M., and Myers, C. S. (I) and Bedale,
   E. M. (II). Two contributions to the experimental
   study of the menstrual cycle. I. Its
   influence on mental and muscular efficiency.
   II. Its relation to general functional efficiency.
   Rep. industr. Fatigue Res. Board, No. 45, London,
   1928.

Validity of Work Histories Obtained by Interview * 


Elizabeth Keating, Donald G. Paterson, and C. Harold Stone 


Industrial Relations Center, University of Minnesota 


The work of the employment interviewer, 
the vocational counselor, and the research 
worker in the field of labor mobility is de- 
pendent in large measure upon the validity of 
work histories obtained by interview. In view 
of the urgency of the problem and in view 
of the extensive literature on the interview, it 
is surprising to discover that so little evidence 
on the problem of validity has been reported. 


Review of Literature 


Search of the literature reveals only one 
study dealing directly with the problem of 
validity of work histories obtained by inter- 
view and in which tabular data are presented. 
Creamer and Coulter (7) in reporting a survey 
of unemployed mill workers in Manchester, 
New Hampshire, gave data on validity by 
comparing work history interview information 
with employer records. 

The fact that only one study of this specific 
problem giving quantitative evidence has been 
unearthed is most surprising in view of the 
voluminous literature on personnel administra- 
tion, personnel psychology, industrial relations, 
and labor market mobility. Furthermore, 
when one stops to realize the millions of situa- 
tions in which routine employment practice 
requires "recording of work history of appli- 
cants and checking with previous employers" 
it is amazing to find no reports concerning the 
validity of such work histories. Books on 
personnel management such as those by Tead 
and Metcalf (23), Shefferman (21), Scott,
Clothier et al. (20), Baridon and Loomis (2),
Knowles and Thompson (11), Walters (27),
Yoder (28), Pigors and Myers (18) and Jucius
(10) have been searched in vain for cited
evidence. Books on personnel psychology
such as those by Link (13), Hollingworth (8),
Viteles (26), Laird (12), Jenkins (9), Burtt (5),
Maier (14) and Tiffin (24) are likewise unre-
warding so far as evidence on this specific prob-
lem is concerned. Emphasis in these books is
placed on the unreliability and low validity of
the interview as a selection and placement
device. Jenkins (9) and Laird (12), however,
do mention validity of work histories obtained
by interview and in both instances the im-
pression is given that validity is low, but again
evidence is not cited.

* The original data were secured [...] Employment
Stabilization Research Institute of [...]
Minneapolis: University of [...] of Dr. Dale Yoder,
Director [...] Center, and Dr. H. G. Heneman, [...]
The present report was pre[...]

The most valuable treatment of the inter- 
view in the literature is that by Bingham and 
Moore (3). They present an excellent dis- 
cussion of the problem of validity but the 
specific evidence cited concerns only their own 
study showing a high degree of validity of 
verifiable aspects of interviews with employees 
concerning employment guarantee plans and
their attitudes and opinions toward such plans. 

The literature on labor market mobility 
and studies of the unemployed in the labor 
market is likewise singularly free from detailed 
evidence concerning validity of work histories.
Bowley and Burnett-Hurst (4) recognized the 
importance of the problem, collected pertinent 
data but failed to present the details. They
merely stated: “Wage statements were checked
in a considerable number of cases by [obtaining]
facts from the employers, and the results show
that there was no evident bias in the direction
of overstatement or understatement, though
there were mistakes” (4, p. 174). Bancroft (1)
skirts the fringes of the problem by reporting
a check on the accuracy with which unem-
ployed workers on relief provide interviewers
with statements concerning relief grants and
date of latest registration at public employment
offices. Palmer (17) is concerned with the
reliability and consistency of responses given
in labor market inquiries but cites no evidence
on validity determined by checking with rec-
ords of previous employers. Even the ex-
cellent studies by the Yale researchers as
reported by Reynolds and Shister (19) and by
Noland and Bakke (16) fail to cite any evidence
in regard to the validity of the work histories
obtained in their extensive series of field
interviews.

Clague, Couper and Bakke (6) tested the 
accuracy of workers’ statements about prior 
jobs against employer records but failed to 
present quantitative evidence. Instead, they 
state: “As a result of these tests we have come 
to the following conclusions. A vast majority 
of the workers answer as accurately as they are 
able to do so. Only a very small fraction of 
the schedules gave evidence of attempts to 
mislead the interviewers . . . The replies with 
respect to wage rates were on the whole 
excellent . . . On time intervals, however, he 
is much less sure of himself . . . On the whole 
the data as presented in this study are accurate, 
and if the necessary corrections could be made 
the results would not be materially changed”
(6, p. 129).

Myers and Maclaurin (15) report that, in 
addition to information secured directly from 
company records, they interviewed a sample 
of 233 workers and checked their answers 
against company records. Again, detailed 
tabular evidence is not given but statements 
are made which seem to imply low validity of 
work history reports. For example, “More 
than a third did not mention jobs which they 
were known to have had . . . It was seldom 
that a job held as long as six months was not 
mentioned by the worker. For this reason, 
we are inclined to attribute these omissions to 
‘poor memory.’ . . . There were errors in esti-
mating the length of particular jobs, and
workers tended to overestimate rather than
underestimate the lengths of the jobs they had
held in the preceding three years. . . . A fre-
quent discrepancy was found between the size
of the weekly paycheck which the worker
claimed to have earned and the amount which
his employment records showed he had actually
earned. In those cases where a comparison
was possible, more than five times as many
workers overestimated their earnings as under-
estimated them. Usually the difference was
small, but the tendency was nonetheless im-
portant. . . . One conclusion that stands out
from this analysis of discrepancies is that 
workers’ memories or statements cannot be 
relied upon for detailed factual information 
concerning their work experience. They are 
frequently unable to remember all the jobs 
they have had, the order in which they had 
them, the dates of employment, the length of 
their jobs, and their earnings." 

Creamer and Coulter's report (7) was a 
WPA study of the accuracy of work histories 
given by unemployed textile mill workers, 
90 per cent of whom were foreign born. Re- 
cords of 227 persons interviewed in their homes 
were compared with employer's records for the 
period 1930-1934. Results were reported by 
four tenure groups: 


1. Those reporting one job with continuous 
employment during the 5 year period studied. 

2. Those reporting one job but with one or 
two periods of no employment. 

3. Those reporting more than one job with 
some periods of no employment. 

4. Those reporting no employment during 
the five year period. 


Results showed that only 14.1 per cent of 
the 227 cases reported their duration of employ- 
ment with “complete accuracy." More than 
two-thirds of the cases involved over-statement 
of the length of employment. The conclusion 
is drawn that "this evidence suggests that in 
any industry characterized by intermittent em- 
ployment, a work history of the fairly recent 
past based on the memory of the worker will 
tend to minimize the degree of intermittency 
by understating the number of jobs and by 
overstating the total duration of employment 
by appreciable amounts” (7, p. 342).

It is difficult to evaluate the Creamer and 
Coulter study in a straightforward manner. It 
is based upon home interviews of 227 former 
employees of a textile mill. The interviews 
apparently took place in November 1936 and 
covered prior employment with the Amoskeag 
Textile Mill extending backward in time from 
December 1934. The qualifications and train- 
ing of the interviewers are not stated. Fur- 
thermore, employment in this particular com- 
pany was characterized by “intermittency” so 
that it is little wonder that former employees
who were largely semi-skilled and foreign born
were hazy and inaccurate in recalling the 
number of jobs held in the company and the 
duration of unemployment during each of the 
five years preceding December 1934. A fur- 
ther difficulty is that the results are presented 
in statistical tables which provide only totals 
and averages. These are difficult if not im- 
possible to understand, let alone interpret. 
While the general impression of the low validity 
of work history data as given is probably cor- 
rect, one should keep in mind that no precise 
index of validity is available and that the data 
themselves are provided by interviewers and 
interviewees under conditions that would be 
expected to promote a maximum of memory 
error. 

In summary, the reports by Bowley and 
Burnett-Hurst and by Clague, Couper and 
Bakke give an impression of high validity 
whereas the reports by Myers and Maclaurin 
and by Creamer and Coulter give an impression 
of low validity. It is obvious that the im-
portance of the problem in social science re-
search and in personnel practice is such as to
warrant citation of detailed evidence in such a
manner that no one is forced to rely on an author's
conclusions but can interpret the evidence
directly. There is urgent need for this kind of
evidence. It is the aim of the present report
not only to give this kind of evidence but also
to give it in a form that is clear and un-
ambiguous.


Procedure in the Present Study 


The research reported in this paper is
based on data gathered in studies of the oc-
cupational competence of unemployed persons
in St. Paul conducted by the Employment
Stabilization Research Institute in 1940-1942.
A random sample of unemployed per-
sons who registered for employment at the St.
Paul office of the USES during the period
September, 1941, through February, 1942, was
studied.¹

¹ For further detailed discussion of the sample, see
29, p. 129 ff.

Extensive personal and occupational his-
tories were secured from the registrants by
interviewers on the staff of the Research Insti-
tute.² Specific information was obtained con-
cerning the nature of job duties, dates of em-
ployment and separation, wage rates, reasons
for separation, names of previous employers, 
etc., for every job held by each registrant since 
his entry into the labor force. Interviews 
were conducted in a clinical situation as part 
of a comprehensive program of individual case 
analysis. Those interviewed were unaware 
that their statements were to be subjected to 
independent verification. However, each per- 
son's full cooperation was obtained by inform- 
ing him that the results of the interviewing 
and testing would be used not only for research 
but more importantly for counseling to enable 
the person to obtain the type of job and/or 
training for which he appeared to be fitted. 

Verification and amplification of inter-
viewees’ statements were obtained by personal
interviews with employers in the Twin City
area by an “employer contact man” on the
staff of the ESRI.³ Information was also ob-
tained by mail from firms outside the metro-
politan area. Using Toops’ “follow-up” tech-
nique (25), an 80 per cent return was obtained
on mailed inquiries.

Verification of employees’ statements
against company records calls for [consideration]
of the possibility of error in company records.
However, employer contacts in the [study] were
made with great care and wherever a possibility
of inaccuracy was noted by the employer con-
tact man, this fact was stated in the case
history report. Data from employers [when]
noted as questionable were not used. In this
study, consequently, discrepancies between the
statements of former employees and the em-
ployer are assumed to be the result of in-
accurate information provided by the former
employee. However, errors on the part of the
interviewers in securing and recording work
histories may have been involved.

Complete case histories for 385 unemployed⁴
were available. Case histories for 236 reg-
istrants (157 men, 79 women) contained suffi-
cient information to permit their inclusion in
the present study. Employer reports on 373
jobs held provided information in one or more
of the areas studied, i.e., duration of job, weekly
wage received, and nature of duties performed.

² These interviewers were first-year graduate stu-
dents specializing in personnel psychology who had
one or more courses dealing at least in part with the
general principles of interviewing. Before actual inter-
viewing in the employment office began, each was given
a minimum of one day of detailed instruction and super-
vised practice in conducting the type of interview
required in this study. Close supervision was con-
tinued throughout the investigation.

³ Henry Morgan of the Research Staff was assigned
the responsibility of securing employer reports.

Because memory distortion would be ex- 
pected to operate differentially in terms of the 
time these 373 previous jobs were held, it was 
necessary to classify each by a time interval 
category. Thus, a 0-12 months category in- 
cluded all jobs terminated within one year 
prior to the interview. The 13-24 months 
category included all jobs terminated from 13 
months to 2 years prior to the interview, and
so on. This breakdown was designed to reveal
differences in validity of the occupational 
history between the immediate and more dis- 
tant past. 
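
The classification rule described above can be sketched in a few lines. This is only an illustration, assuming elapsed time is recorded in whole months between job termination and interview; the function name and the sample values are hypothetical and do not come from the study records.

```python
from collections import Counter


def time_interval_category(months_since_termination):
    """Return the time-interval category for a job, given the number of
    whole months between its termination and the interview.

    Categories follow the text: 0-12 months, 13-24 months, 25-36 months,
    and so on in twelve-month steps.
    """
    if months_since_termination < 0:
        raise ValueError("months_since_termination must be non-negative")
    if months_since_termination <= 12:
        return "0-12 months"
    lower = ((months_since_termination - 1) // 12) * 12 + 1
    return f"{lower}-{lower + 11} months"


if __name__ == "__main__":
    # Hypothetical elapsed times (in months) for a handful of jobs.
    elapsed = [3, 12, 13, 24, 30, 61]
    counts = Counter(time_interval_category(m) for m in elapsed)
    for category, n in counts.items():
        print(category, n)
```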

Case histories of men and women were kept 
separate to facilitate study of sex differences
in validity of job reports. Scatter diagrams 
for each time-interval were plotted to show 
the relation of the employer contact report to 
the interview record with respect to: 1. Weekly 
wages (male and female); and 2. Duration of 
job (male and female). 


Results for One Year Time Interval 


Wages. Table 1 shows the relationship be- 
tween reported weekly wages and verified 
weekly wages for male workers on jobs held 
within one year of time of interview. Table 
2 presents the same data for the women 
workers. Validity coefficients of +.90 and 
+.93 (based on ungrouped data) may be re- 
garded as justifying the acceptance of wage 
data reports obtained under the circumstances 
surrounding this investigation with consider-
able faith. Furthermore, neither men nor
women showed any consistent tendency to
over-state or to under-state their wages.

Length (Duration) of Job. Distribution of
data for jobs on which information was
available on job duration is given in Tables 3
and 4. Again, the agreement between worker
reports and employer records is close as re-
vealed by validity coefficients of +.98.
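
For readers who wish to reproduce this kind of check on their own records, the following is a minimal sketch of a validity coefficient computed as a Pearson r on ungrouped pairs, together with a tally of over-statement and under-statement. The paired figures are invented for illustration and are not the study data; the function names are likewise hypothetical.

```python
import math


def pearson_r(reported, verified):
    """Pearson product-moment correlation for paired, ungrouped values."""
    n = len(reported)
    if n != len(verified) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 pairs")
    mean_x = sum(reported) / n
    mean_y = sum(verified) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reported, verified))
    sxx = sum((x - mean_x) ** 2 for x in reported)
    syy = sum((y - mean_y) ** 2 for y in verified)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


def direction_of_error(reported, verified):
    """Count pairs in which the report exceeds, falls short of, or equals
    the verified value."""
    over = sum(1 for x, y in zip(reported, verified) if x > y)
    under = sum(1 for x, y in zip(reported, verified) if x < y)
    exact = sum(1 for x, y in zip(reported, verified) if x == y)
    return {"over": over, "under": under, "exact": exact}


# Hypothetical weekly wages in dollars (interview report, employer record).
reported_wages = [18.0, 22.5, 30.0, 12.0, 45.0, 27.0]
verified_wages = [18.0, 21.0, 32.0, 12.0, 44.0, 27.0]

if __name__ == "__main__":
    print(round(pearson_r(reported_wages, verified_wages), 2))
    print(direction_of_error(reported_wages, verified_wages))
```

A high r alone does not rule out a constant bias, which is why the direction tally is kept alongside the coefficient.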

⁴ These cases range from “professional” to “un-
skilled.” See Local labor market research (29, 140-141)
for distribution according to primary job classifications.


Table 1

Relationship between Reported Weekly Wages and
Verified Weekly Wages for 116 Male Workers
on Jobs Held 0-12 Months Prior to
Interview (1940-1941)

Note: Pearson r for ungrouped data = +.90.

Weekly Wages                0-9  10-19  20-29  30-39  40-49  50-59  60-69  Total
Reported (row totals)         5     32     50     19      7      1      1   115*
Verified (column totals)      6     31     48     21      6      1      2   115*

[Scatter table of reported (rows) by verified (columns) weekly wages; only the marginal totals are recoverable.]

* One case not shown: reported weekly wage, $105;
verified weekly wage, $100.


Table 2

Relationship between Reported Weekly Wages and
Verified Weekly Wages for 61 Female Workers
on Jobs Held 0-12 Months Prior to
Interview (1940-1941)

Note: Pearson r for ungrouped data = +.93.

[Scatter table of reported (rows) by verified (columns) weekly wages in class intervals 0-4, 5-9, 10-14, 15-19, 20-24, and 25-29; cell frequencies not recoverable.]


Duties of Job. In the 0-12 months time 
interval category, there were 144 male cases 
which contained data sufficient for study. 
Agreement between the interviewee's state- 
ment and the employer report of job duties 
was found in 94 per cent of the cases. There 
was a 96 per cent agreement for the 68 women. 
The nature of the data is such as to preclude
the presentation of "scatter tables" or the 
computation of correlation coefficients. 

In the small percentage of cases where the
duties of the job as stated by the interviewee
did not agree with the employer contact re-
ports, there was a tendency to inflate the level
of skill and responsibility in their jobs.


Table 3

Relationship between Reported and Verified Duration
of Employment in Months for 127 Male
Workers on Jobs Held 0-12 Months
Prior to Interview

Note: Pearson r for ungrouped data = +.98.

Duration (Months)          0-19  20-39  40-59  60-79  80-99  100-119  Total
Reported (row totals)        93     16      7      7      3        0   126*
Verified (column totals)     93     14      8      8      2        1   126*

[Scatter table of reported (rows) by verified (columns) duration; only the marginal totals are recoverable.]

* One case not shown: reported duration, 192 months;
verified duration, 192 months.


Table 4

Relationship between Reported and Verified Duration
of Employment in Months for 64 Female
Workers on Jobs Held 0-12 Months
Prior to Interview

Note: Pearson r for ungrouped data = +.98.

[Scatter table of reported (rows) by verified (columns) duration in class intervals 0-19, 20-39, 40-59, 60-79, 80-99, and 100-119 months; cell frequencies not recoverable.]


Results for the Longer Time Intervals 


Because the number of cases involved in
the reports for time intervals greater than one year
is small, the results are presented as
correlation coefficients.

Wages. Table 5 [...]
to six years. Of course, the number of cases
is so small that the correlation coefficients
must be accepted only with great reservations.
The most that can be said is that the evidence 
does not indicate any definite drop in validity 
with the passage of time. 

Duration of Employment. Table 6 also 
indicates that duration of employment is re- 
ported with a surprisingly high degree of 
validity even though time intervals from one 
to six years are involved. Of course, the 
numbers of cases involved are small and the 
correlations are not to be accepted at face 
value. Nevertheless, there is no evidence of 
decreasing validity with lapse of time. 

Duties of Job. Data on reports of duties
of the job, held at time intervals prior to inter-
view greater than one year, were found to be
surprisingly accurate. It is only the occasional
applicant for employment who distorts his
previous work history under the conditions
surrounding the present investigation.


Table 5

Correlations between Reported Weekly Wages and
Verified Weekly Wages for Male and Female
Workers on Jobs Held at Time Inter-
vals Greater Than One Year

Time Interval
(Job Termination Date          Men               Women
Prior to Interview)       Number    r       Number    r
13-24 Months                24     .95       [?]     [?]
25-36 Months                10     .94       [?]     [?]
37-48 Months                19     .62       [?]     [?]
49-60 Months                 5     .99        -       -
61-72 Months                 8     [?]        -       -

[?] marks an entry that is not legible.


Table 6

Correlations between Reported and Verified Duration of
Employment for Male and Female Workers
on Jobs Held at Time Intervals Greater
than One Year

Time Interval
(Job Termination Date          Men               Women
Prior to Interview)       Number    r       Number    r
13-24 Months                33     [?]        9      .99
25-36 Months                16     [?]       [?]     .99
37-48 Months                22     .94        7      [?]
49-60 Months                12     .97        -       -
61-72 Months                 7     .88        -       -

[?] marks an entry that is not legible.


Summary 


In the extensive literature on the interview 
surprisingly little evidence exists with respect 
to the validity of work histories obtained by 
interview. Only one study was found bearing 
directly on the problem and this study was 
seriously defective. 

The present study of applicants for employ- 
ment in the St. Paul Office of USES was con- 
ducted on a research basis in a setting in which 
occupational guidance was stressed. Pre- 
sumably little incentive existed for these appli- 
cants to distort their work histories. The 
validity of the work histories when checked by 
employer reports was found to be surprisingly 
high with respect to weekly wages, duration of 
employment and job duties. Validity re- 
mained high for histories secured for jobs held 
up to six years prior to the interview as well as 
for jobs held just prior to the interview. In 
terms of correlation coefficients, the validities 
may be generalized as being from +.90 to +.98. 


Received October 17, 1949.
Early publication. 


References 


1. Bancroft, G. Consistency of information from
records and interviews. J. Amer. statist. Ass.,
1940, 35, 377-381.

2. Baridon, Felix E., and Loomis, E. H. Personnel
problems. New York: McGraw-Hill Book Co.,
1931. Pp. 25-29.

3. Bingham, W. V., and Moore, B. V. How to inter-
view. New York: Harper and Brothers, 1934
(rev. ed.). Ch. 12, pp. 190-216. (The 1942
revision omits the report of their validity study.)

4. Bowley, A. L., and Burnett-Hurst, A. R. Liveli-
hood and poverty. London, England: G. Bell and
Sons, Ltd., 1915. Pp. 222.

5. Burtt, H. E. Principles of employment psychology.
New York: Harper and Brothers, Rev. Ed.,
1942. Pp. 405-475.

6. Clague, E., Couper, W. J., and Bakke, E. W. After
the shutdown. New Haven: Institute of Human
Relations, Yale University, 1934.

7. Creamer, D., and Coulter, C. W. Labor and the
shut-down of the Amoskeag Textile Mills. Phila-
delphia: Work Projects Administration, National
Research Project Report No. L-5, November,
1939. Pp. 342.

8. Hollingworth, H. L. Judging human character.
New York: D. Appleton and Co., 1923. Pp.
263.

9. Jenkins, J. G. Psychology in business and industry.
New York: John Wiley and Sons, Inc., 1945.
Pp. 72-78.

10. Jucius, Michael J. Personnel management. Chi-
cago: Richard D. Irwin, Inc., 1947. Pp. 175-
200.

11. Knowles, A. S., and Thompson, R. D. Industrial
management. New York: The Macmillan Co.,
1944. Pp. 334-336.

12. Laird, D. A. The psychology of selecting employees.
New York: McGraw-Hill Book Co., Inc., 1937
(3rd ed.). Pp. 99-119.

13. Link, H. C. Employment psychology. New York:
The Macmillan Co., 1921. Pp. 435.

14. Maier, N. R. F. Psychology in industry. Cam-
bridge: The Riverside Press, 1946. Pp. 426.

15. Myers, C. A., and Maclaurin, W. R. The move-
ment of factory workers. New York: John Wiley
and Sons, Inc., 1943. Pp. 111.

16. Noland, E. W., and Bakke, E. W. Workers wanted.
New York: Harper and Brothers, 1949. Pp.
224.

17. Palmer, G. L. The reliability of response in labor
market inquiries. Washington, D. C.: Technical
Paper No. 22, Executive Office of the President,
Bureau of the Budget, Division of Statistical
Standards, July, 1942.

18. Pigors, P., and Myers, C. A. Personnel adminis-
tration. New York: McGraw-Hill Book Co.,
Inc., 1947. Pp. 497.

19. Reynolds, L. G., and Shister, J. Job horizons.
New York: Harper and Brothers, 1949. Pp.
102.

20. Scott, W. D., Clothier, R. C., et al. Personnel
management. New York: McGraw-Hill Book
Co., Inc., 1941.

21. Shefferman, N. W. Employment methods. New
York: The Ronald Press Co., 1920.

22. Stead, W. H., and Shartle, C. L. Occupational
counseling techniques. New York: American
Book Co., 1940. Pp. 212.

23. Tead, O., and Metcalf, H. C. Personnel adminis-
tration. New York: McGraw-Hill Book Co.,
Inc., 1933 (3rd ed.).

24. Tiffin, J. Industrial psychology. New York: Pren-
tice-Hall, Inc., 1947 (2nd ed.).

25. Toops, H. A. The returns from follow-up letters
to questionnaires. J. appl. Psychol., 1926, 10,
92-101.

26. Viteles, M. S. Industrial psychology. New York:
W. W. Norton and Co., Inc., 1932.

27. Walters, J. E. Personnel relations. New York:
The Ronald Press Co., 1945.

28. Yoder, D. Personnel management and industrial
relations. New York: Prentice-Hall, Inc., 1948
(3rd ed.). Pp. 894.

29. Yoder, D., and Paterson, D. G., et al. Local labor
market research. Minneapolis: University of
Minnesota Press, 1948.


Study of Executive Leadership in Business. 
II. Social Group Patterns 


C. G. Browne


Wayne University 


This is the second in a series of articles 
presenting the following methods for the study 
of executive relationships and leadership in 
business: RAD index, social group patterns 
and organizational contacts, sociometric pat-
tern, and Goal and Achievement index.¹

The total study proceeded on the following 
hypotheses: (1) leadership is a process based 
upon the inter-relationships of individuals in a 
group that is working toward a goal which has 
been accepted as desirable; (2) executive func- 
tion and leadership in business are processes of 
the interaction of social and working relation- 
ships within and outside of the executive 
groups; (3) executive and leader relationships 
can be analyzed through the application of 
methods which are not designed to meas- 
ure personal executive traits as psychological 
entities. 

The subjects for the study were 24 execu- 
tives in a tire and rubber manufacturing com- 
pany, representing all but one of the executives 
on the first, second, third, and fourth echelons 
of the business. They were classified into the 
following departmental groups: general admin- 
istration, 4 cases; sales, 6; finance, 4; manu- 
facturing, 8; and personnel, 2. 


Social Group Patterns 


As part of a longer interview, each executive 
indicated his first, second, and third choices of 
the following social groups in terms of their 
importance in his social life: group 1, indi- 
viduals within the company, called the Com- 
pany group; group 2, individuals outside of 
the company but with whom there was a 
business affiliation with the executive's own 
work and called the Outside and Business 
group; and group 3, individuals outside of the 
company with whom there was no business 
affiliation, called the Outside Only group. 

¹ Browne, C. G. Study of executive leadership in
business. I. The RAD index. J. appl. Psychol., 1949,
33, 521-526.


Each executive also indicated the amount of 
social activity which he had with each social 
group using a three-point scale, the intervals 
designated as “large amount,” “some,” and 
“no” social life. 

In calculating the score on social group 
patterns for each executive, the first social 
group choice was given six points, that is, the 
social group which was most important in the
social life of the executive; second choice was 
given four points; and third choice, two points. 
For the amount of activity with each social 
group, “large” was assigned four points; 
“some,” two points; and “no,” zero points. 
The score for each executive was the product
of the points allowed for the order of the
social group choice and the amount of social
activity with each group.
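
The scoring rule can be sketched as follows, reading the score for each social group as the product of the choice points and the activity points. That reading is an inference, but it agrees with the maximum scores of 24, 16, and 8 reported for the first echelon in Table 2. The names used in the code are illustrative only.

```python
# Points for the order of choice (first, second, third) as given in the text.
CHOICE_POINTS = {1: 6, 2: 4, 3: 2}

# Points for the amount of social activity as given in the text.
ACTIVITY_POINTS = {"large": 4, "some": 2, "no": 0}


def social_group_score(choice_rank, activity):
    """Score for one social group: choice points times activity points."""
    return CHOICE_POINTS[choice_rank] * ACTIVITY_POINTS[activity]


# An executive who ranks the groups Outside and Business, Company,
# Outside Only, and reports a large amount of social life with each.
example = {
    "Outside and Business": (1, "large"),
    "Company": (2, "large"),
    "Outside Only": (3, "large"),
}
scores = {
    group: social_group_score(rank, activity)
    for group, (rank, activity) in example.items()
}

if __name__ == "__main__":
    for group, score in scores.items():
        print(group, score)
```

Under this rule a group with which an executive reports no social life scores zero regardless of its rank.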

In Table 1 the social group pattern scores
are tabulated by total and departmental
groupings of executives.² For the total execu-
tive group and the sales, finance, and manu-
facturing departments, the three social groups
ranked Outside Only, Company, Outside and
Business. However, the general administra-
tion group ranked the Company social group
first, followed in order by Outside and Business
and Outside Only. Whereas Outside and
Business was the second social group with the
general administration executives, it was the
third or least important group with all of the
other executive groups and the total group.
However, Outside and Business was a relatively
more important social group with the sales
executives than it was with finance or manu-
facturing. The finance executives indicated no
Outside and Business social contacts, but their
score for Outside Only social contacts was
higher than for any other executive group.

In their total social activities, the general
administration executives had the highest
standing, followed in order by sales, manu-
facturing, and finance. Both general admin-
istration and sales were above the mean and the
median for the total group, while finance and
manufacturing were below.

² The personnel executives were included in the manu-
facturing group.


Table 1

Social Group Pattern Scores Tabulated by Total and
Departmental Groups of Executives

                      Executive
Social                Depart-          Mean   Median   Total
Group                 ment        N    Score  Score    Points
Company               Total      24    11.2     8.5     268
                      GA*         4    18.0    20.0      72
                      S           6    10.0     8.0      60
                      F           4     8.0     8.0      32
                      M          10    10.4     8.6     104

Outside and           Total      24     5.2     0.9     124
Business              GA          4    11.0    10.0      44
                      S           6     9.3     4.0      56
                      F           4     0.0     0.0       0
                      M          10     1.6     0.8      16

Outside Only          Total      24    13.5    12.0     324
                      GA          4     8.0     8.0      32
                      S           6    13.3    16.0      80
                      F           4    15.0    12.3      60
                      M          10    13.6    11.7     136

* Code: GA, General Administration; S, Sales; F,
Finance; M, Manufacturing.

From this analysis it appeared that a 
general social pattern developed which was 
based upon the type of work which the execu- 
tive did or the department to which he was 
assigned. This pattern centered around the 
relative importance of the company or business 
contacts in social activities, the increase in 
their relative importance being observed in 
the general administration and sales depart- 
ments. On the other hand, there was a com- 
plete absence of social business contacts in the 
finance department. A consideration of the 
individual executive's duties and work per- 
formed revealed that the social group choices 
followed the lines of necessity to a large degree 
in that certain types of social activity became 
necessary and essential to the performance of 
the individual as well as the departmental 
functions. 

In Table 2, the social group pattern scores 
are tabulated by echelon level in the business. 


The president and general-manager, who con- 
stituted echelon one, had the highest possible 
score for each of the three social groups, indi- 
cating a large amount of social life with each 
social group. For him, the groups ranked as 
follows: Outside and Business, Company, Out- 
side Only. 

Echelon two was second in the total amount 
of social activity score, with the social groups 
ranked Company, Outside Only, Outside and 
Business. Echelon four preceded three in the 
total amount of social activity, this being 
strongly influenced by the scores of four sales 
executives who were in the fourth echelon. 
It was observed in the previous section that 
sales executives tended to have more social life. 
Both echelons three and four ranked the social 
groups Outside Only, Company, Outside and 
Business. 

Thus, the executives at higher echelons ex- 
hibited a closer identification with the company 
through their social activities, the Company 
and Outside and Business social groups being 
more important than the Outside Only group. 
When this is combined with the departmental 
pattern of social choices, it appears that execu- 
tives in general administration and sales and 
those in the higher echelons throughout the 
business had more social activity, particularly 
with those social groups which had a relation- 
ship with company operations. 


Table 2 


Social Group Pattern Scores Tabulated by 
Executive Echelon Level 


Social                 Echelon          Mean    Median   Total
Group                  Level       N    Score   Score    Points

Company                One         1    16.0    16.0      16
                       Two         7    14.3     8.7     100
                       Three      10     9.2     8.3      92
                       Four        6    10.0     8.3      60

Outside and Business   One         1    24.0    24.0      24
                       Two         7     7.0     3.5      48
                       Three      10     2.0     0.6      20
                       Four        6     3.3     3.7      20

Outside Only           One         1     8.0     8.0       8
                       Two         7    10.3    11.3      72
                       Three      10    15.2    12.3     152
                       Four        6    15.3    14.0      92




Organizational Contacts 


To study organizational contacts, each 
executive named all of the professional and 
social organizations to which he belonged, and 
indicated which memberships, if any, were 
paid by the company. Table 3 presents these 
results by executive department and echelon. 
The general administration group indicated 
the greatest number of social organization 
memberships per executive, followed by per- 
sonnel, sales, finance, and manufacturing. It 
should be remembered, however, that the per- 
sonnel group consisted of only two cases. 
However, the company was paying for social 
memberships for certain executives in the 
general administration and sales areas only. 
A study of the duties of these executives re- 
vealed that their activities with social organiza- 
tions constituted an important aid to their 
company functioning. Obviously, then, the 
company was willing to carry the expense of at 
least some of these organizations. 

The personnel group ranked highest in the 
number of professional organizations per execu- 
tive, followed by general administration, 
finance, manufacturing, and sales. Of the six 
executives in the sales group, only the vice- 
president-sales and the sales manager belonged 
to any professional organizations. Each was 
a member of one organization, but these mem- 


berships were not paid by the company. In 
finance, both the treasurer and the comptroller 
belonged to two professional organizations, and 
in each case, both memberships were paid by 
the company. The chief cost accountant be- 
longed to three professional organizations, none 
of which were paid by the company. In the 
manufacturing group, although 12 professional 
memberships were held, only one of them, for 
the chief chemist, was paid by the company. 

When the organizational memberships, and
memberships paid by the company were analyzed
on the basis of executive echelon groups,
Table 3 shows a decrease in each variable from
echelon one through echelon four. That is,
executives in successively lower echelons
belonged to fewer social organizations, fewer
professional organizations, and fewer memberships
were paid by the company for each type of
membership. It might be said that as an
executive advanced in this company into higher
echelons, it could have been expected that he
would become a member of more organizations,
and that his work would be such that the
company would be willing to pay for some of
these memberships. On the other hand, it may
have been that the organizational activities of
these executives were characteristic of the
individual executive operating in his own
particular pattern and were not a particular
function of his position in the business.


Table 3

Membership in Social and Professional Organizations Tabulated by Executive Department and Echelon

                           Social            Professional           Total
                         Memberships         Memberships          Memberships
                           (mean)              (mean)               (mean)
                                Paid by             Paid by              Paid by
                    N    Number   Co.        Number   Co.         Number   Co.

Executive Departmental Groups
  Total group      24     2.46    .46         1.67    .67          4.13    1.13
  Genl. Adm.        4     4.50   1.75         2.75   1.75          6.25    3.50
  Sales             6     2.50    .67          .33   0.00          2.83     .67
  Finance           4     1.75   0.00         1.75   1.00          3.50    1.00
  Manufacturing     8     1.50   0.00         1.50    .12          3.00     .12
  Personnel         2     3.50   0.00         4.00   2.00          7.50    2.00

Executive Echelon Groups
  Echelon 1         1    10.00   5.00         4.00   4.00         14.00    9.00
  Echelon 2         7     2.57    .86         1.86   1.00          4.43    1.86
  Echelon 3        10     2.10   0.00         1.80    .40          3.90     .40
  Echelon 4         6     1.67   0.00          .67    .17          2.34     .17




Conclusions 


From this study of the social group patterns 
and the organizational contacts of a population 
of 24 executives in a tire and rubber manufac- 
turing company, the following hypotheses may 
be advanced: 

1. The social group choices and the amount 
of social activity of a business executive are in 
part determined by: (1) the work which he 
does within the business; (2) the department 
in which he works; (3) the echelon level into 
which he is classified within the organization. 

2. Among the characteristics of executive 
performance and leadership in business are 
membership in social and professional organiza- 
tions and memberships paid by the company, 
the number of all of these variables increasing 
as the executive advances in echelon level. 

The question of whether the variables 
studied were a function of the job or the indi- 


vidual in the job can be answered for business 
leadership in general only with horizontal 
studies of many executives in a wide variety 
of companies or with longitudinal studies over 
an extended period of time with specific com- 
panies or with a given group of executives. 

For the present, however, the discussion 
presented here may prove to be of value in the 
selection and training of business executives 
since it suggests that (1) working relationships 
at higher echelons and in certain departments 
may extend beyond the confines of the office 
and the factory; (2) the individual who is 
aspiring to higher echelons or to certain de- 
partments may need to accept the probability 
that his executive duties for the company will 
extend to his social life; and (3) at least part of 
an executive's social life will be planned for the 
benefit which it will yield to the company and 
his own success in it. 


Received May 17, 1949.


Readability and Interest Values in an Employee Handbook * 


James N. Farr 


University of Minnesota 


Adequate communication between manage- 
ment and workers is essential to the success of 
modern personnel programs. The basis of 
such programs is an informed work force; a 
work force informed about company policies, 
plans, and competitive position; a work force 
with information as to the rules governing the 
organization of which they are a part, and an 
explanation of the “why” of these rules. 

The means developed for this sharing of in- 
formation include employee handbooks, house 
organs, company newspapers, bulletin boards, 
etc. These media are important, and much
time and money are spent in their production. 
There is a growing literature dealing with the 
functions of these media, and the problems 
peculiar to each. There are available detailed 
discussions of how to prepare such media for 
publication, dealing with every step from for- 
mulation of editorial policy to distribution of 
the finished product. 

Such discussions, adequate though they 
are for the problems discussed, do not go far 
enough. Distribution of the finished hand- 
book or house organ is not the final step of the 
process. The process is not complete until 
the employees read what has been written 
and understand what they read! 

Two aspects of the publication are im- 
portant in this final step of the process; the 
attention and interest value of the publication, 
and the readability of the material in the 
publication. The fact that employee publi- 
cations generally contain pictures and illustra- 
tions is an indication that the attention and 
interest problem has been recognized and is 
being met in part,—but only in part. It can 
never be wholly met so long as the second 
aspect—readability—is neglected, for a reader 
will not maintain interest in material that is 
difficult for him to understand. 

* For an example of an excellent employee handbook
the reader is referred to a handbook for Northwest
Airlines employees entitled “Hello! Pa tee”. It
was written for Northwest Airlines, Inc. by Mr. Wendell
Bowles, and was the source for a number of the ideas
used in the handbook discussed in this article.




Review of Literature 


That the problem of readability is being 
neglected is evident. It is scarcely touched 
upon in the literature dealing with employee 
publications. Heron (5) who gives us an ex- 
cellent discussion of the purposes, uses, and 
problems of communications media tells us 
only that the material must be readable and 
understandable. The National Foremen's
Institute (6) in its manual on how to prepare an
employee handbook discusses in great detail
the content and technical problems to be met
in the publication. The problem of readability,
however, is not discussed except for a
statement that “two-dollar” words should be
avoided, and that the writer should strive for
simplicity. Biklen and Breth (2) omit discussion
of the readability problem in their
excellent book on employee publications.
Bentley (1) in his book which deals in great
detail with problems of editing an employee
publication, dismisses the readability problem
with the recommendation that newspaper
style writing be used. This presumably
means simplicity, familiar words, short sentences,
and not too many adjectives and
adverbs.

In view of the fact that the achievement 
of readable material in employee publications 
is essential if it is to have any value at all, the 
problem deserves much more careful consideration
than this. What good is done by a
publication which is perfect in its technical
aspects and content, but which cannot be read
with understanding by a considerable portion
of the intended audience?

Rarely is any information given as to how
to achieve readability. Apparently it is assumed
that the writer of a handbook or house
organ need only be aware of the fact that his
material should be readable and that he will
then see to it that it is so. Nothing is further
from the truth. Seeing to it that a handbook
or house organ is readable and will be read requires
just as much effort and thought as




seeing to it that it includes the correct content 
and expresses the desired point of view. This 
fact is often overlooked, or not realized at all. 
Any personnel man who writes a handbook 
knows intuitively that he writes in a readable 
style. After all, it seems perfectly clear to
him, and his colleagues all seem to understand 
what he has written. This is true, but the 
fact of the matter is that the handbook is not 
being written for his colleagues, but for the 
main body of the firm's employees. These 
people are not college graduates—a good many 
of them are not even eighth grade graduates— 
and material written by a college graduate, 
and understood by his colleagues, is not so 
readily understood by these workers. In fact, 
it may seem so difficult to them that they do 
not even attempt to read it. 

This fact, that information published for 
workers must be written in language they can
understand, if it is to hold their interest, seems 
to be self evident. Few would argue that this 
is not necessary. Most, in fact, would argue 
that they consider this factor carefully when 
publishing a handbook, for instance. This is 
no doubt true. Every effort may be made to 
make the handbook readable,—but for whom? 
The difficulty here seems to be in overlooking 
the fact that what is readable for one group of 
people is not necessarily readable for another 
group. The problem is to see that the hand- 
book is readable for the intended reader, not 
for the writer. 

This has been a difficult problem. It is 
easy to look at a piece of writing and tell 
whether it is hard to read or easy to read, in
terms of your own scale of difficulty. It is
much harder to judge the difficulty level in
terms of the reading ability of someone else,
such as a workman who has had an eighth
grade education. Fortunately we now have
several objective methods of estimating the
reading difficulty level of written material. 
Application of the readability formulae devel- 
oped by Flesch (3, 4, 4a), for example, gives a 
good estimate of reading difficulty, and of the 
educational level of those who will be able to 
read the material with understanding. 

If the application of the Flesch formulae 
indicates that the material is written at a 
difficulty level too high for the intended audi- 
ence, the chief problem then arises: how to 


write at a level below that on which you are 
accustomed to write without seeming to “write 
down." Flesch (3, 4a) has made this problem 
much easier to solve. By following the rules 
he has laid down it is possible with practice to 
produce material which is both readable and 
well written. 

Application of the Flesch formulae to man- 
agement communications indicates that very 
often the material is written on a difficulty level 
far beyond the comprehension of the group to 
whom it is directed. Paterson and Jenkins 
(7) in their analysis of an information sheet 
for potential applicants in one large firm found 
that it was far too difficult for that group. The 
writer found in an analysis of 25 company 
house organs, and 25 union papers, that they 
were written at a level that effectively shut 
out a large portion of the intended readers. 

This is obviously a ridiculous situation. It 
is futile to expend large amounts of time, 
energy, and money to publish a handbook or 
house organ which will not, or cannot be read 
with understanding by many of those for whom 
it is prepared. 


Proposed Employee Handbook 


Recently the writer was asked to analyze 
and revise a proposed employee handbook for 
a textile firm. The results of the analysis, 
and the steps taken in the revision to ensure a 
readable handbook will be discussed in the 
following pages. 

Before beginning the revision, Flesch's 
readability formulae were applied to samples 
drawn from the proposed handbook. This 
sampling indicated that the difficulty level of 
the writing was far too high for the group of 
employees to whom it was to be given. 

The first step, therefore, was to rewrite the 
handbook in much simpler language. This 
was done by following the rules given by Flesch 
(3, 4a). Long sentences were broken up into 
short simple sentences; words of one or two
syllables were substituted for longer words 
wherever possible; statements were addressed 
directly to the reader wherever possible; and 
the material was personalized by the use of 
personal pronouns. 

This resulted in a handbook which was 
much easier to read, but which was far too 




long. It was now in a form which the workers 
could read but not in a form which they likely 
would read. For even though a handbook is 
written at a level the worker can understand, 
it will not hold his interest if it consists of page 
after page of words. It must be remembered 
that though the facts presented in a handbook 
are of value to a worker, they may not seem 
to him to be of immediate value. He will 
take them if they are easy to come by, but he 
will not go out of his way to sift these facts 
from paragraphs of useless words. 

For this reason, a handbook should contain 
as few words as you can get by with rather 
than all that you can crowd into it. Get rid 
of words wherever possible. Put in a picture 
instead. Or leave white space. If nothing 
else, this will make the important facts and 
ideas stand out so that some of them get across 
even in a cursory reading. 

The second step, then, was to strip away all 
excess words which served only to hide the 
ideas that were to be presented. The result of 
this step was to reduce the number of words 
from approximately 10,400 to approximately 
2,800, and the number of pages from 39 to 28. 

The revision now contained 28 pages with 
an average of 100 words per page. This meant 
there was ample white space, and ample room 
for pictures or illustrations. In the third
step of the revision, illustrations were added.
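
The arithmetic of this step can be checked in a few lines. The sketch below is only an illustration: it uses the approximate figures quoted above, and the variable names are ours rather than anything taken from the handbook.

```python
# Approximate counts quoted in the text for the original and revised handbook.
original_words, original_pages = 10_400, 39
revised_words, revised_pages = 2_800, 28

# Average number of words on a page before and after the second step.
original_words_per_page = original_words / original_pages
revised_words_per_page = revised_words / revised_pages

# Share of the original words that were stripped away.
share_removed = 1 - revised_words / original_words

print(f"original: {original_words_per_page:.0f} words per page")
print(f"revised:  {revised_words_per_page:.0f} words per page")
print(f"words removed: {share_removed:.0%}")
```

On these figures the page load falls from roughly 267 to 100 words, and close to three quarters of the original words are gone.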

The illustrations used were simple line 
figures. They serve to catch attention and to 
emphasize the rules or ideas being discussed. 
They also offer a painless method of “laying 
down the law.” Consider the example in 
Figure 1. 


HE WAS LATE 

SO OFTEN HE 

PASSED ON OWE 
DAY. 


Fig. 1. Example of line drawing to illustrate the 
company rule regarding excessive tardiness. 


This is much less harsh than a statement to 
the effect that "if you are late too often, 
disciplinary action will be taken.” 

The revision was now complete. The 
handbook had been rewritten in simple lan- 
guage, excess words had been discarded, leaving 
the principal ideas emphasized on uncrowded 
pages, and simple illustrations had been added. 

Below is reproduced a page from the 
original handbook, followed by the page from 
the revised handbook which dealt with the 
same subject matter. 


Incentive Pay

The Company has installed wherever possible
an incentive or piece work plan as a means of giving
additional pay to those producing more than the
average for a normal day's work. It is important
that you, as an employee of the company, be fully
familiar with the operation of this plan.

A trained time study person observes the operation
being performed by an operator. The job is
broken down into the work elements of which it is
composed and a time study person is trained to
recognize unusually good or below average effort
and ability. If the time required to perform an
operation is less than normal expectancy, the employee
benefits from the extra effort. If an employee
is working slower than normal, the reverse
would be true. Provision is made for normal delays
under personal and machine allowances.

Time standards are guaranteed against change
as long as a method of operation and conditions
surrounding the operation remain unchanged. However,
in those instances where either the operator
or the company feels that a rate is too low or too
high, a re-timing may be requested and a change in
rate negotiated. Any complaint on rates should
be brought to the attention of your department
supervisor. Time studies will be made in all instances
where the department supervisor and the
person in charge of time study activities feel that a
re-timing should be made.

The management encourages any questions or
suggestions concerning the operation of our incentive
program. We are always ready to accept
constructive comments on this subject because in
the final analysis it will be mutually advantageous.


Compare the two sample pages shown.
Can there be any doubt as to which can be
best understood by the worker? Or which is
the more likely to be read with attention?

Consider the contents of the two pages.
The first contains 278 words. The revised
page contains 100 words. A glance will show
that the difficulty level of the words used in
the revised page is far below that of the first
sample. It is far easier to read the revised




page with its abundance of white space. And 
the drawing and “Reward” box, simple as 
they are, are attention catching features. 

It may be argued that the revised page has 
omitted some of the information contained in 
the original. That is true. The essential in- 
formation is there, however. It must be borne 
in mind when comparing these two pages that 
a large number of the employees who were to 
receive this handbook had only an eighth 
grade education. The writer believes that it 
is better to present the essential idea in a way 
that will get it across than to present a large 
amount of information which may not even 
be read. 

A comparative analysis of the original and 
revised handbooks was made, using Flesch’s 
new Readability Formulae (4). Thirteen 100- 


word samples were drawn from the original, 
and fourteen 100-word samples were drawn 
from the revised handbook. The results of 
the application of the formulae to these samples 
are shown in Tables 1 and 2. 
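
To make the sampling step concrete, the sketch below shows one way the two reading-ease counts might be taken from a sample: average sentence length in words and syllables per 100 words. It is an illustration only. The function names are ours, the syllable counter is a crude vowel-group heuristic rather than the hand count an analyst would make, and treating hyphenated words and contractions as single words is an assumption about the counting rules.

```python
import re

def split_sentences(text):
    # Assumption: a sentence ends at ., ! or ? followed by white space.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def split_words(text):
    # Hyphenated words and contractions are treated as single words.
    return re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", text)

def estimate_syllables(word):
    # Rough estimate: count vowel groups in each alphabetic part and drop
    # a silent final "e". This will miscount some words.
    total = 0
    for part in re.findall(r"[a-z]+", word.lower()):
        count = len(re.findall(r"[aeiouy]+", part))
        if part.endswith("e") and not part.endswith("le") and count > 1:
            count -= 1
        total += max(count, 1)
    return total

def measure(sample):
    """Return (syllables per 100 words, average sentence length in words)."""
    words = split_words(sample)
    sentences = split_sentences(sample)
    sentence_length = len(words) / len(sentences)
    word_length = 100 * sum(estimate_syllables(w) for w in words) / len(words)
    return word_length, sentence_length

# Two sentences taken from the revised handbook page.
sample = ("A time-study man studies your job. "
          "He finds out how long the average worker takes to do it.")
word_length, sentence_length = measure(sample)
print(round(word_length, 1), sentence_length)
```

A full analysis would apply the same counts to each 100-word sample and then convert them to scores with Flesch's formula or tables.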

The average Readability Score for the 
original handbook was 59.2, a difficulty level 
requiring some high school for adequate under- 
standing. The scores ranged from “Difficult” 
to “Fairly Easy” with 8 of the 13 scores “Fairly 
Difficult” or harder. This means that over 
half of the samples drawn from the original 
handbook required “some high school" or more 
education for adequate understanding. In 
view of the fact that a large number of the em- 
ployees who were to receive the handbook had 
eighth grade or less education, it is evident 
that its effectiveness would have been limited. 
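
The interpretation of a Readability Score follows directly from the bands in Table 1. The sketch below encodes those bands as a lookup; it is an illustration only. The names are ours, and the treatment of a score that falls exactly on a boundary (assigned here to the higher band) is an assumption, since the table does not say.

```python
# (lower bound, upper bound, description of style, school grade of audience),
# transcribed from Table 1.
READING_EASE_BANDS = [
    (0, 30, "Very difficult", "College"),
    (30, 50, "Difficult", "H.S. or some college"),
    (50, 60, "Fairly difficult", "Some H.S."),
    (60, 70, "Standard", "7th or 8th grade"),
    (70, 80, "Fairly easy", "6th grade"),
    (80, 90, "Easy", "5th grade"),
    (90, 100, "Very easy", "4th grade"),
]

def interpret_reading_ease(score):
    """Return (description of style, school grade) for a Readability Score."""
    for lower, upper, style, grade in READING_EASE_BANDS:
        # Assumption: a boundary score belongs to the higher band,
        # except that 100 belongs to the top band.
        if lower <= score < upper or (upper == 100 and score == 100):
            return style, grade
    raise ValueError("score must lie between 0 and 100")

print(interpret_reading_ease(59.2))  # average of the original handbook
print(interpret_reading_ease(87.7))  # average of the revised handbook
```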


Reward!
For those who do better than average work

Our piece-work plan gives you more
pay for “better than average” amount
of work.

Here's how it works.

A time-study man studies your job. He
finds out how long the average worker
takes to do it.

Then the piece-work rate is set so that
if you are an average worker you earn a
fair day's pay.

If you are better than the average
and take less than the average time to
do the job—you earn extra pay.

YES SIR, EXTRA EFFORT PAYS OFF IN EXTRA DOLLARS


Fig. 2. Sample page from revised handbook.




Table 1


Flesch Readability Scores and Interpretation for Samples Drawn From Original and Revised Handbooks


                   Tabulation                                         School Grade
Readability                          Description        Typical       of Potential
Score            Original  Revised   of Style           Magazine      Audience

0 to 30                              Very difficult     Scientific    College
30 to 50             3               Difficult          Academic      H.S. or some college
50 to 60             5               Fairly difficult   Quality       Some H.S.
60 to 70             2               Standard           Digests       7th or 8th grade
70 to 80             3         2     Fairly easy        Slick-fiction 6th grade
80 to 90                       7     Easy               Pulp-fiction  5th grade
90 to 100                      5     Very easy          Comics        4th grade

Average            59.2      87.7


Table 2


Flesch Human Interest Scores and Interpretation for Samples Drawn From Original and Revised Handbooks


Human              Tabulation
Interest                             Description          Typical
Score            Original  Revised   of Style             Magazine

0 to 10                              Dull                 Scientific
10 to 20             2               Mildly interesting   Trade
20 to 40             3         2     Interesting          Digests
40 to 60             5         4     Highly interesting   New Yorker
60 to 100            3         8     Dramatic             Fiction

Average            45.9      39.3

The average Readability Score for the revised
handbook was 87.7, a difficulty level
requiring only a fifth grade education for
adequate understanding. The scores for the
samples ranged from “Fairly Easy” to “Very
Easy.” Very few employees would be unable
to understand material written at this level.

The Human Interest scores were also
higher for the revised handbook, as shown in
Table 2. This means that the revised handbook
contained more personal pronouns and
personal sentences than did the original. This,
according to Flesch, makes the material more
interesting to read.


Summary


1. Recognition of the importance of communication
between management and workers
has resulted in an increase in the use of
written media such as house organs, handbooks,
etc. Discussions of the content, purposes,
and uses of these media are available
in the literature. Discussions of the technical
problems of publication are also available.

2. The important problem of readability in
these written communications has not been
dealt with, however. There is little in the
communications literature dealing with the
need to write at the level the worker can
understand, and little dealing with the problem
of how to write readable material.

3. The Readability Formulae of Flesch now
provide an objective method of evaluating the
difficulty level of written material, and the
rules laid down by Flesch point the way to
writing readable copy.

4. Application of the Readability Formulae
to management-employee communications
indicates that they are frequently written at a
difficulty level far too high for the intended
audience.

5. Application of the formulae to a proposed
employee handbook classified it as “Fairly
Difficult,” requiring “some high school” reading
ability for adequate comprehension. Many
of the workers who were to read the handbook,
however, had only eighth grade or less education.
The handbook was rewritten, using
short sentences, short common words, and
more personal words. Excess words were discarded,
leaving more white space per page,
thus making it easier to read. Flesch analysis
classified the material as “Easy,” requiring
fifth grade reading ability for understanding.


Received April 16, 1949.


References 


1. Bentley, G. How to edit an employee publication.
New York: Harper & Brothers Publishers, 1944.

2. Biklen, P. F., and Breth, D. B. Successful employee
publication. New York: McGraw-Hill Book
Company, Inc., 1945.

3. Flesch, R. The art of plain talk. New York:
Harper and Brothers, 1946.

4. Flesch, R. A new readability yardstick. J. appl.
Psychol., 1948, 32, 221-233.

4a. Flesch, R. The art of readable writing. New York:
Harper and Brothers, 1949.

5. Heron, A. R. Sharing information with employees.
Stanford: Stanford University Press, 1942.

6. American Management Association. How to prepare
and publish an employee manual. New York,
1946.

7. Paterson, D. G., and Jenkins, J. J. Communication
between management and workers. J. appl.
Psychol., 1948, 32, 71-80.


Reliability of the Flesch Readability Formulas


Patricia M. Hayes, James J. Jenkins, and Bradley J. Walker *


Department of Psychology, University of Minnesota


A formula developed by Flesch (3) for estimating
the comprehension difficulty of written
material has received widespread attention in
many areas of communication. It has been
applied in the fields of journalism (7), advertising
(1), industrial communications (5), government
publications (8) and many others.

In view of the wide use of his formula, Flesch
(4) published a revision in 1948 designed to increase
its utility and make its application and
interpretation easier. This revision proposes
the use of two formulas to measure two relatively
independent aspects of readability.
The first formula involves word length (wl)
and sentence length (sl) and gives a measure
of “reading ease” (RE). The second formula,
based on personal words (pw) and personal
sentences (ps), yields a measure of “human
interest” (HI).

As these formulas become increasingly
popular, they must, of course, be evaluated
critically. Like other psychological tools, they
must be tested for validity and reliability.
Flesch (4) reports several studies of the validity
of the original formula which indicate that
material rated more readable by the formula
also proves more readable in terms of readership
surveys and opinions of judges. As yet,
however, no studies of the reliability of the
formulas as applied by different analysts have
been reported.

The following studies were designed as first
steps in the examination of analyst-to-analyst
reliability of the formulas to determine the
extent to which they are effectively objective.


First Study


The material chosen for analysis in the first
study consisted of the 40 prize-winning letters
in the recent General Motors “Why I Like
My Job” contest (9). These letters were
selected because they presented a wide range
of difficulty, style, structure and content. It
was believed the letters would afford a maximum
number of problems in interpretation and
would provide a rigid test of objectivity.

* Hayes and Jenkins are primarily responsible for the
first study in this paper and Walker is primarily
responsible for the second.

Two sets of samples were drawn from the 
40 letters. Each set consisted of two 100- 
word samples from each letter. Since the 
letters ranged in length from about 350 to 3000 
words with a median length of 750 words, there 
was little overlapping between the sets of 
samples. 

Two experienced and two inexperienced
analysts participated in the study.¹ The experienced
workers analyzed both sets of samples;
the inexperienced each analyzed one set. The
experienced analysts had worked with the
original formula and the revised formulas for
a year and a half. The inexperienced analysts
had never worked with the formulas before.
The analysts made no attempt to agree on
interpretation of Flesch’s instructions and refrained
from discussing interpretation with
anyone else.

Reading ease and human interest scores
were computed from tables (2) in an effort to
minimize computational errors. Results of
analyses made by different investigators on
the same set of samples were compared by
determining: the significance of differences
between means of the four variables (wl, sl, pw,
ps) and the two scores (HI, RE) for the
samples as a whole; the extent of correlation
between results of analysts; the number and
degree of differences in actual scores for each
sample; and the number of differences in
descriptive categories assigned to each sample.


Results of First Study 


The assumption of wide variability in the
material used was confirmed. As can be
seen in Table 1, samples ranged from 6 to 88
on the reading ease scale of 0 to 100 points and


istana 
* The writers would like to acknowledge the assist ag 


of Barbara Lee : is part 
project. and James Farr in this P' 




Table 1


Means, Standard Deviations and Ranges for the Four
Variables and Two Scores of the Flesch Readability
Formulas for Each Analyst


            Analyst     wl        sl       RE       pw       ps       HI

Mean
  Set 1        A      145.0     23.7     60.2      9.9     15.3     41.1
               B      145.2     23.3     60.4      9.7     10.2     38.2
               C      145.0     23.9     39.9      9.5     13.6     30.5

  Set 2        B      144.0     22.7     62.1      9.5     11.8     38.6
               C      143.8     22.5     62.3     10.2     14.2     12.2
               D      144.3     23.2     61.2      9.5     13.1     39.3

Standard deviation
  Set 1        A       MO       ALL      15.9      3.5     26.6     18.2
               B       11.8     11.3     16.1      3.3     17.6     16.4
               C       11.7     11.1     15.4      3.7     18.4     15.6

  Set 2        B        9.4     10.8     13.6      2.9     21.4     14.3
               C        9.0      9.8     12.9      2.8     26.7     13.6
               D        9.7      9.9     13.4      3.4     21.1     14.0

Range
  Set 1        A     124-174    11-72     7-85     4-20    0-100   15-100
               B     125-174    10-72     6-84     4-20    0-83    13-97
               C     124-173    11-72     6-82     5-18    0-91    18-94

  Set 2        B     122-164    12-80    42-81     3-15    0-100   16-79
               C     122-162             42-88     5-17    0-100   18-80
               D     123-164    12-74    42-86     5-15    0-100   16-80


from 13 to 100 on the human interest scale of 
0 to 100 points. 
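
The relation between the four variables and the two scores in Table 1 can be sketched in a few lines. The constants below are those of Flesch's 1948 revision (4) as we recall them; they are not stated in this paper and should be checked against that source. The function names are ours, wl is taken as syllables per 100 words, sl as average sentence length in words, pw as personal words per 100 words, and ps as per cent of personal sentences.

```python
def reading_ease(wl, sl):
    # RE from word length (syllables per 100 words) and
    # sentence length (average words per sentence).
    return 206.835 - 0.846 * wl - 1.015 * sl

def human_interest(pw, ps):
    # HI from personal words per 100 words and
    # per cent of personal sentences.
    return 3.635 * pw + 0.314 * ps

# Means reported in Table 1 for analyst A on sample set 1.
re_score = reading_ease(wl=145.0, sl=23.7)
hi_score = human_interest(pw=9.9, ps=15.3)
print(round(re_score, 1), round(hi_score, 1))
```

Because both formulas are linear, applying them to the mean variables should reproduce the mean scores. The small gaps from the tabled means of 60.2 and 41.1 are presumably due to rounding in the computing tables the analysts used.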

The means of the four variables and two
scores obtained by analysts on the same sets
of samples were tested for significant differences
by use of the critical ratio corrected for correla-
tion. None of the differences between any
analysts in either sample set proved to be
significant at the five per cent level.
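
A critical ratio corrected for correlation divides the difference between two means by a standard error that allows for the correlation between the paired measures. The sketch below is an illustration under stated assumptions and not the writers' actual computation: it takes N as 40 (one score per letter), computes each standard error as SD divided by the square root of N, and substitutes the rank correlation from Table 2 for the product-moment correlation. The function name is ours.

```python
import math

def critical_ratio(mean_1, mean_2, sd_1, sd_2, r, n):
    # Standard error of each mean (assumption: SD / sqrt(N)).
    se_1 = sd_1 / math.sqrt(n)
    se_2 = sd_2 / math.sqrt(n)
    # Standard error of the difference between correlated means.
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2 - 2 * r * se_1 * se_2)
    return (mean_1 - mean_2) / se_diff

# Reading ease, analysts A and B, sample set 1 (means and SDs from Table 1).
cr = critical_ratio(60.2, 60.4, 15.9, 16.1, r=0.98, n=40)
print(round(cr, 2))
```

Under these assumptions the ratio is well inside the plus or minus 1.96 needed for significance at the five per cent level, which agrees with the result reported above.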

Rank difference correlations were computed
between each pair of analysts within each
sample set on the rank given each letter.
These correlation coefficients are presented in
Table 2. All of the correlations are positive
and significantly different from zero beyond
the one per cent level.
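
The rank-difference method can be sketched as follows. The scores below are hypothetical and serve only to show the calculation; the function names are ours. Tied scores receive the average of the ranks they span, in which case the simple formula is only approximate.

```python
def average_ranks(values):
    # Rank from lowest to highest; tied values share the mean rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_difference_correlation(x, y):
    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), d = difference in ranks.
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical reading ease scores given to five letters by two analysts.
analyst_1 = [60, 45, 72, 30, 85]
analyst_2 = [62, 50, 48, 33, 80]
rho = rank_difference_correlation(analyst_1, analyst_2)
print(round(rho, 2))
```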

An inspection of Table 2 indicates that
analysts were in good agreement in interpreting
the components and final score for reading
ease. For the human interest variables, however,
there was much less agreement between
analysts. Personal words were apparently
interpreted much the same by analysts, but it is
evident there were diverse interpretations of
personal sentences. This acts, of course, to
lower the correlations of the human interest
score.

It should be noted that correlations be- 
tween experienced analysts (B and C) are not 
appreciably different from those with inexperi- 
enced analysts (A and D). 

Since neither of the statistical methods 
presented above reveals the actual differences 
between analysts for a given sample, a third 
kind of comparison was made. Within each 
sample set all analysts were compared on 
reading ease and human interest scores for 
each letter. Actual point differences for each 
pair of analysts were tabulated. Results of 
the 240 comparisons are shown in Table 3. 

This table also suggests that there is greater 
agreement on reading ease (90 per cent of the 
comparisons within four points of each other) 
than on human interest (90 per cent of the 
comparisons within eight points of each other). 
On the 100-point scale designed to be used as 
an estimating device, deviations as small as 
these do not appear to be of great importance. 

Again it should be mentioned that no consistent
difference was found in the number or
extent of deviations between scores of experienced
analysts compared to their deviations
with scores of inexperienced analysts.


Table 2


Rank Order Correlation Coefficients between Pairs of
Analysts for the Variables and Scores of the
Flesch Readability Formulas *


                   Sample Set 1                 Sample Set 2
Analysts**   A and B  A and C  B and C    C and D  B and D  B and C
wl             .99      .99      .99        .99      .99      .99
sl             .94      .97      .94        .83      .87      .94
RE             .98      .99      .98        .93      .95      .97
pw             .93      .94      .96        .92      .99      .93
ps             .89      .69      .74        .63      .87      .60
HI             .88      .91      .96        .78      .97      .80


* These correlations should be interpreted with caution
since the data are markedly skewed in the case of
ps and HI. It should be noted, however, that the
order of relative accuracy for the scores is the same
whether correlations, point differences or category
differences are considered.

** Analysts A and D are inexperienced; B and C
experienced.

A final comparison was made between ana- 
lysts in terms of descriptive categories in which 
results are often reported and utilized. Flesch 
(4) divides the reading ease range into seven 
levels varying from “very difficult” to “very 
easy” and the human interest range into five
levels varying from “dull” to “dramatic.” 


Table 3


Differences in Score Points Between Analysts
on Identical Samples *


                   Reading Ease               Human Interest
Difference    Per Cent of  Cumulative    Per Cent of  Cumulative
in Points     Comparisons  Percentage    Comparisons  Percentage

1                49.2         49.2          33.3         33.3
2                25.8         75.0          18.0         51.3
3                 9.2         84.2           6.6         57.9
4                 6.2         90.4          15.4         73.3
5                 3.8         94.2           4.2         77.5
6                 1.6         95.8           5.4         82.9
7                 1.7         97.5           3.4         86.3
8                  .8         98.3           3.7         90.0
9                  .9         99.2           2.5         92.5
10                 .8        100.0            .8         93.3
11 or more                                   6.7        100.0


* Based on 240 comparisons; three analyses for each
of 80 samples of 200 words.


Of the 240 comparisons, in only 14 cases 
(5.8 per cent) did the analysts differ in the 
category assigned to reading ease. In 28 cases 
(11.7 per cent) they differed in the category 
assigned to human interest. Of these cases of 
disagreement, only two (.8 per cent) were 
greater than one category for reading ease, and 
only four (1.7 per cent) were greater than one 
category for human interest.


Discussion


The results of this first study seem to
provide a limited but fairly clear answer to
the question of reliability of the Flesch formulas.
A discussion of the greatest sources of error
may be of some value in interpreting the data
and may provide a few hints to those who wish
to use the formulas.



The greatest discrepancies obviously appear
in interpretation of personal sentences. A
study of Table 2 shows that correlations for
this variable are especially low when analyst
“C” is involved. Samples which contributed
most to the discrepancy between “C” and the
other analysts were studied. Over half of the
major differences between “C” and the others
involved one type of personal sentence defined
by Flesch (4) as “grammatically incomplete
sentences whose full meaning has to be inferred
from the context.” The examples given by
Flesch (4) appear to be taken from conversations,
and apparently the definition was regarded
as limited to conversations by analysts
“A,” “B,” and “D.” It would seem, however,
that the examples are to this extent misleading
since conversational sentences are already
covered by Flesch's first definition regarding
spoken sentences. It might be suggested that
analysts study the definitions carefully and
that Flesch provide more varied examples.

A second source of disagreement involved
rhetorical questions. Analyst “C” did not
count these as personal sentences, but it
appears clear from Flesch’s definition that
these should have been considered and scored
as the other analysts scored them.

If these two sources of error (incomplete
sentences and rhetorical questions) had been
corrected, correlations for personal sentences
would have been raised above .90, and
human interest scores would have appeared
much more reliable.

Errors in personal words were few and
appear to be due largely to carelessness.
While there were no consistent errors, from
time to time one analyst or another tended to
regard a common-gender noun like “yore”
or “manager” as a personal word. The
definitions and examples are explicit on this
point (4).

Errors in sentence length resulted chiefly
from disregarding directions on counting the
sentence when the sample ends in the middle
of a sentence and from disagreement on breaking
sentences into units of thought. It may
be noted that the lowest correlations in Table
2 for sentence length involve analyst “D.” A
study of the samples yielding the greatest
discrepancies revealed that if analyst “D” had
broken sentences into units of thought in just
two instances, correlations would have been
above .95. Here it appears that more careful
attention to directions would have assured
high reliability.

Errors in word length are all very small and 
appear to reflect minor clerical errors. 


Second Study 


A second study was conducted to test our 
findings with a large number of inexperienced 
analysts. Samples of 500 words from 63 house 
organs and employee publications which were 
being examined in connection with a continuing 
study of industrial communication (6) were 
assigned for analysis to 18 members of a gradu- 
ate seminar in psychology. Each student 
analyzed seven publications which were sub- 
sequently reanalyzed by another member of 
the seminar. Assignments were anonymous 
and cooperation between students was dis- 
couraged. Only three of the students had 
appreciable experience with the formulas prior 
to the time of the study. 


Table 4


Sampling Statistics of Test and Re-Test Distributions
for the Second Study


                         Standard
         Means           Deviation          Range
      Test  Re-Test    Test  Re-Test    Test    Re-Test
wl   155.4   154.9     7.08    7.98    138-172  140-167
sl    20.6    20.5     4.39    4.15     15-45    13-45
RE    54.5    55.2     8.40    7.71     30-73    31-69
pw     7.3     6.6     2.38    2.24      3-16     2-13
ps    12.8    13.1    11.17   11.45      0-46     0-48
HI    30.3    28.3     9.56   10.74     15-64     8-62


The results of the analyses provided pairs
of scores for each of the publications. The
first analyses were compared with the second
analyses to determine the reliability of the
application of the formulas to the same
samples. Product moment correlations between
the “test” and “retest” analyses are as
follows: wl, .90; sl, .92; RE, .91; pw, .78; ps, .64;
and HI, .81. All coefficients were positive and
significantly different from zero.
The data for the means, standard deviations
and ranges are presented in Table 4. A
comparison of the standard deviations and the
ranges in Table 4 with those in Table 1 reveals
that the material used in the second study was
appreciably more homogeneous than that used
in the first. The correlations found in the
second study, then, might be expected to be
smaller than those of the first study. Accordingly,
the correlations presented immediately
above were “corrected” by estimating
their magnitude on the basis of the more
heterogeneous material of the first study.
This “correction” gave the following coefficients:
wl, .95; sl, .99; RE, .98; pw, .88; ps, .85;
and HI, .92.
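
The article does not state which formula was used for this “correction.” One common way of estimating a reliability coefficient for a more heterogeneous group is sketched below as an assumption, not as the authors' procedure: it supposes that the error variance is the same in both groups, so that only the spread of true scores differs. The function name and the standard deviations in the example are hypothetical.

```python
def reliability_in_wider_group(r_narrow, sd_narrow, sd_wide):
    """Estimate reliability in a more variable group, assuming equal error
    variance in both groups: (1 - R) / (1 - r) = sd_narrow**2 / sd_wide**2."""
    return 1 - (sd_narrow ** 2 / sd_wide ** 2) * (1 - r_narrow)

# Hypothetical values: r = .64 in a group with SD 10, re-expressed for SD 20.
estimate = reliability_in_wider_group(0.64, 10.0, 20.0)
print(round(estimate, 2))
```

Under this assumption the estimate rises as the ratio of the two standard deviations departs from one, and it leaves the coefficient unchanged when the two groups are equally variable.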

These correlations approximate those found 
in the first study and would lead one to the 
same conclusions. Reading ease with its com- 
ponents is analyzed quite reliably and human 
interest with its components is analyzed with 
less, though still fair, reliability. Analysis of 
personal sentences again shows the greatest 
lack of agreement between analysts. 

The difference in points between “test” and 
“retest” analyses agrees rather closely with the 
data from the first study given in Table 3. 
Ninety per cent of the paired scores for reading 
ease were within six points of each other and 
approximately ninety per cent of the paired 
scores for human interest were within eight 
points of each other. 


Summary and Conclusions 


An examination of analyst-to-analyst re- 
liability of the Flesch readability formulas was 
conducted. In the first study two sets of 
samples were drawn from reading material of 
a highly variable nature believed to involve a 
large number of problems of interpretation. 
The sets of samples were analyzed by two 
inexperienced and two experienced analysts. 
Results of analysts for each set of samples were 
compared by testing the significance of mean 
differences on the variables and the scores, 
correlating results on the variables and scores, 
tabulating deviations in terms of score points 
and tabulating disagreements in descriptive 
categories. 

In the second study, eighteen students
analyzed samples of 500 words from 63 industrial
house organs. Each sample was independently
analyzed by two analysts. Correlations
between the first and second sets of
analyses were computed and then corrected for
restriction of range. Deviations in terms of
score points were computed.

From the above data the following con- 
clusions seem justified: 

1. Analyst-to-analyst reliability on word 
length, sentence length, and reading ease is 
quite high for the kinds of material used in 
this study. 

2. Analyst-to-analyst reliability on personal 
words is fair, but on personal sentences (and as 
a result on human interest) is lower than might 
ordinarily be considered desirable. 

3. For practical purposes the Flesch for- 
mulas and the directions for their use are 
sufficiently objective to be used even by in- 
experienced analysts to obtain estimates of the 
reading ease and human interest of written 
material. 


Received June 8, 1949. 


References


1. Alden, J. Lots of names—short sentences—simple
words. Printer’s Ink, June 29, 1945, 21-22.

2. Farr, James N., and Jenkins, James J. Tables for
use with the Flesch readability formulas. J.
appl. Psychol., 1949, 33, 275-278.

3. Flesch, R. The art of plain talk. New York:
Harper and Brothers, 1946.

4. Flesch, R. A new readability yardstick. J. appl.
Psychol., 1948, 32, 221-233.

5. Paterson, D. G., and Jenkins, James J. Communication
between management and workers. J.
appl. Psychol., 1948, 32, 71-80.

6. Paterson, D. G., and Walker, B. J. Readability
and human interest of house organs. Personnel,
1949, 25, 438-441.

7. Swanson, C. E. Readability and readership: a controlled
experiment. Journ. Quart., 1948, 25,
339-343.

8. How does your writing read? U. S. Civil Service
Commission. Washington: U. S. Government
Printing Office, 1946.

9. The worker speaks. General Motors, 1947.


The MacQuarrie Test for Mechanical Ability. 
IV. Time and Motion Analysis * 


Charles H. Goodman 


Radio Corporation of America 


This is the fourth¹ and last article on the
use of the MacQuarrie Mechanical Ability Test 
in a radio manufacturing company. This 
study presents the results of a time and motion 
analysis of the test. The purpose of this study 
was to determine whether a time and motion 
analysis could provide insight into the number 
of factors being measured by the test, and 
whether this type of analysis could assist in the 
interpretation of the factors in the test. 

Of the seven MacQuarrie sub-tests, only 
four—Tracing, Tapping, Dotting, and Copying 
—lend themselves to time and motion analysis. 
The reason is that only these four sub-tests 
involve manual movements and could there- 
fore be analyzed by this technique. The par- 
ticular method of time and motion analysis 
used in this study was that of synthesis,² which
is defined as a method of determining the select 
time for a given motion pattern by the applica- 
tion of standard moving time values to a de- 
tailed motion analysis. 


Procedure


Upon completion of the time and motion 
analysis, it was possible to construct Table 1 
which shows the various time and motion 
elements found to be operating in the four sub- 
tests. This analysis also provided the data 
whereby one could determine which elements 
were common among the tests and the fre- 
quency with which these elements occurred 
while carrying out the tasks. 

* The writer wishes to thank Mr. Fred Weber, Time 
Study Engineer, RCA Victor Corporation, for his 
assistance in conducting the time and motion analysis 
for this study. 


¹ Goodman, Charles H. The MacQuarrie test for
mechanical ability. I. Selecting radio assembly operators.
J. appl. Psychol., 1946, 30, 586-595; II. Factor
analysis. J. appl. Psychol., 1947, 31, 150-154; III.
Follow-up study. J. appl. Psychol., 1947, 31, 502-510.

² Synthesis manual. Radio Corporation of America,
RCA Victor Division, Camden, N. J., 1944.


Results 


Overlapping r’s were computed on the basis
of the data recorded in Table 1. The formula
used to compute the overlapping r’s was:


$$ r = \frac{n_c}{\sqrt{(n_a + n_c)(n_b + n_c)}} $$
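
The formula can be illustrated with a short sketch. It reads n_c as the number of elements common to two tests and n_a and n_b as the numbers of elements specific to each test, which is the usual reading of a common-elements correlation but is not spelled out in the text. The element names and counts below are hypothetical, and counting an element once for each time it occurs is likewise an assumption.

```python
import math
from collections import Counter

def overlapping_r(elements_a, elements_b):
    """Common-elements correlation: n_c / sqrt((n_a + n_c) * (n_b + n_c))."""
    n_c = sum((elements_a & elements_b).values())   # elements shared by both tests
    n_a = sum((elements_a - elements_b).values())   # elements found only in test a
    n_b = sum((elements_b - elements_a).values())   # elements found only in test b
    return n_c / math.sqrt((n_a + n_c) * (n_b + n_c))

# Hypothetical element frequencies for two sub-tests.
test_a = Counter({"reach": 1, "grasp": 2, "move": 1})
test_b = Counter({"grasp": 1, "move": 1, "position": 2})
r = overlapping_r(test_a, test_b)
print(r)
```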


The object of computing these overlapping
r’s was to determine whether the relationships
among these four tests, as based upon the time-study
elements, would approximate the relationship
among the same tests as shown by the
Pearsonian intercorrelations. Table 2 shows
the overlapping r’s and their Pearsonian counterparts.
Four of the six overlapping r’s deviate
from their Pearsonian r’s from .00 to .09. The
overlapping r for the Tracing and Tapping Test
deviates .21 from its corresponding Pearson r.
The sixth and largest deviation of .23 involves
the Tapping Test with Copying. The writer
is unable on the basis of available data to explain
why these two deviations are larger than
the other four.

While it appears that there is considerable
agreement between the Pearsonian r’s and the
overlapping r’s, one cannot lose sight of the
fact that in using the formula for computing
the overlapping r’s one does not know if the
factors operate independently and additively,
or in a dependent and related manner. While
this finding is of considerable interest, further
study would certainly be needed before one
could attach any significance to it.

Since it was found that considerable agreement
existed between the Pearsonian r’s and
the overlapping r’s, the question was raised as
to whether or not time and motion analysis
would be helpful in interpreting the factors
being measured by the MacQuarrie test.

Table 1


Time Study Elements Operating in the Tasks Required by the MacQuarrie Tests

[Table body illegible in this copy: rows for Tracing, Tapping, Dotting, and Copying, with entries (x, 2x, 4x, 8x) marking each time study element and its frequency.]


³ Goodman, C. H. Op. cit.


An earlier study³ had been made by the
writer in an endeavor to identify the factors in
the MacQuarrie test by means of factor analy- 
sis. This factorial study showed three factors 
to be operating in the MacQuarrie test. The 
test loadings of the first factor ranged from 
.369 to .000. While none of the test loadings 
for this factor were statistically significant,⁴ it
appeared to the writer that some factor was 
operating. The sub-tests of Tracing, Dotting, 
and Pursuit had loadings of .369, .338 and .325 
respectively, while the remaining four sub- 
tests were zero or slightly above zero. While 
the writer was unable to identify the factor 
on the basis of the factor analysis data, time 
and motion analysis data clearly indicated to 
the writer that it was a “visual inspection” 
factor, since the time and motion analysis 
showed that the three largest test loadings on 
this first factor called for considerable amounts 
of visual inspection before the tasks could be 
done. On the other hand, little, if any, visual 
inspection was called for by the remaining four 
sub-tests. From a statistical point of view this 
factor was named only tentatively since none 
of the loadings met the criteria of being as 
large as .40. On the other hand, based on the 
time study analysis, the writer had no hesitancy 
in so naming the factor. It is further the
opinion of the writer that he could not have 
readily identified this factor without the time 
study data. 


The test loadings of the second factor in the
factorial analysis were statistically significant
and this factor was named a “spatial” factor.
Corroborating evidence that this factor was a
“spatial” factor comes from the work of Thurstone⁵
and Harrell,⁶ in that their studies also
showed the MacQuarrie to be loaded with a
“spatial” factor.


⁴ Thurstone, L. L. Primary mental abilities. Chicago:
University of Chicago Press, 1938.

⁵ Thurstone, L. L. Op. cit.

⁶ Harrell, W. A factor analysis of mechanical ability
tests. Psychometrika, 1940, 5.


The time and motion analysis, in the opinion
of the writer, did not in this instance provide
any assistance in identifying this “space”
factor. The reason for this may be that the
space factor is psychological in content, presumably
involves mental processes and therefore
escapes the detection of time and motion
studies, a technique which is only applicable to
observable acts.

In naming the third factor disclosed by the
factorial analysis, the writer found the time
and motion data of material assistance. It
was apparent from such analysis that the tests
carrying the highest loadings involved manual
movement. This might readily have been concluded
from mere study of the tests themselves.
However, further study of the data provided
by the time study analysis showed that it was
more than mere manual movement. Time
study data showed that in carrying out these
tasks it was necessary to exert careful control
in carrying out the manual movement. This
control involved the muscles being […] in
order that motions could be stopped when
there was need to make alignment and to
carefully guide the pencil in prescribed […].


Further evidence to support this view was ob- 
tained from high speed camera photographs 
which showed the motion in operation and the 
control being exercised. It was on the basis of 
this evidence that the writer named this factor 
a “controlled manual movement.” 

Harrell,⁷ in his studies with the MacQuarrie,
found a factor evidently similar to
that of the writer. Harrell named this factor 
a “manual agility” factor. On the basis of the 
foregoing evidence, the present writer cannot 
accept the concept and connotation of this 
factor as being a “manual agility” factor. 


Table 2

Pearsonian Correlations Compared with Correlations
Computed on the Basis of Overlapping
Elements

               Tracing               Tapping               Dotting
           Pearson  Overlapping  Pearson  Overlapping  Pearson  Overlapping
              r         r           r         r           r         r

Tracing
Tapping      .48       .27
Dotting      .55       .58         .47       .50
Copying      .44       .39         .31       .54         .34       .43


As a result of this study it would appear to
the writer that the technique of time and
motion analysis might be of considerable assistance
in helping to analyze performance tests
for the purpose of identifying the factors involved
in such tests. It would seem that the
area of mechanical dexterity would be a fruitful
field to explore. One should not lose sight of
the fact that time and motion analysis will
probably tend to show a greater number of
elements in operation than does the statistical
factorial method. This can be seen in Table 1
of the present study where some eleven elements
are shown as compared with the three
factors disclosed by factor analysis. This may
be due primarily to the fact that the elements
of time and motion analysis are more minutely
broken down and that there may be various
combinations of these elements in a single
task. However, there is no available evidence
to prove or disprove this contention. More
important is the fact that time study elements
should not be considered as being equal or comparable
to the so-called psychological factors
disclosed by factor analysis.


⁷ Harrell, W. Op. cit.


Summary 


This study has presented the findings of a 
time and motion analysis of four of the sub- 
tests of the MacQuarrie Mechanical Ability 
Test. The purpose of the study was to de- 
termine whether time and motion analysis 
could assist in the interpretation of the factors 
being measured by the test. 

On the basis of the findings of this study 
the following conclusions might be warranted: 

1. Correlations computed on the basis of 
overlapping time and motion elements give, in 
four of the six cases, r’s closely similar to their 
Pearsonian r’s. Deviations in four cases range 
from .00 to .09. The fifth and sixth deviations
were .21 and .23. It would be mere speculation
on the part of the writer to explain these larger
deviations.

2. There is close agreement in the analysis of 
the overlapping elements as produced by the 
factor analysis and by the time and motion 
analysis. The factor analysis identified a 
space factor, and a manual movement factor. 
The time and motion analysis revealed a visual 
inspection factor and a controlled manual 
movement factor. The space factor, which 
appears to be more psychological in nature, was 
not revealed by the time and motion analysis. 

3. Time and motion analysis technique 
appears to have the possibility of being a valu- 
able adjunct for gaining analytical insight into 
psychological tasks involving manual move- 
ments. 


Received April 16, 1949. 


The Pre-Engineering Inventory as a Predictor of Success in 
Engineering Colleges * 


Frederic Lord and John T. Cowles 


Educational Testing Service 


and 


Manuel Cynamon 
Brooklyn College 


The Pre-Engineering Inventory is a battery 
of seven objective tests, designed primarily to 
assist in the selection of those students who 
will be most likely to succeed in engineering 
schools. This report summarizes a group of
related studies on the reliability of this battery 
of tests and on their predictive efficiency, as 
evidenced particularly by correlations of the 
test scores with various measures of engineering 
school achievement. 

The Pre-Engineering Inventory is intended
for use both as an instrument for selecting
entering students and for the guidance of
students before or during the first two years of
undergraduate engineering studies. It is the
purpose of the Pre-Engineering Inventory to
supply measures of only those aptitudes or
[…] records, recommendations by teachers, and
personal interviews.


* […] Ledyard R […] acknowledge the able […]
activities of the Graduate Record […]
Educational Testing […] guide the policies and […]


Description of Tests and Scores 

The present form of the Pre-Engineering
Inventory, known as Revised Form A, was first
made available in 1944, and has been used in
two principal types of testing programs: a
national program administered on fixed dates
at widespread testing centers, and an institutional
program administered on varying dates
by cooperating institutions. The national
program was designed for testing applicants
for entrance to engineering schools, the institutional
program for testing students already
enrolled in those institutions.

The difficulty level of the tests of the Pre-Engineering
Inventory has been adjusted to the
range of capacity of a base sample of
freshmen in a broadly representative group of
accredited engineering schools; the present
form of the Inventory is not intended to be used
in high schools or liberal arts colleges.

The tests of the Pre-Engineering Inventory
are contained in two booklets, with separate
answer sheets adapted to machine or hand
scoring. The questions are of the
multiple-choice type and include passages requiring
interpretation, problems for solution,
diagrams to comprehend, and questions on
specific information. Emphasis is placed on
essential aptitudes, skills, fundamental information,
and the discovery or application of
principles, rather than on mere factual memory
or achievement in specific courses of instruction.
The seven tests included in the Pre-Engineering
Inventory are:² Test I, General Verbal Ability;
Test II, Technical Verbal Ability; Test III,
Ability to Comprehend Scientific Materials;
Test IV, General Mathematical Ability; Test V,
Ability to Comprehend Mechanical Principles;
Test VI, Spatial Visualizing Ability; and Test
VII, Understanding of Modern Society.


Table 1


Reliabilities of Raw Scores on the Pre-Engineering Inventory Tests for Five Schools Participating in the
Fall 1947 Testing and for a Random Sample from the April 1948 National Program


                                                 School              National
Tests                                    V    W    X    Y    Z      Program

I. General Verbal Ability               .93  .94  .93  .93  .94      .94
   (100 items)
II. Technical Verbal Ability            .92  .94  .90  .85  .91      .92
   (90 items)
III. Ability to Comprehend              .91  .88  .89  .89  .91      .88
   Scientific Materials (100 items)
IV. General Mathematical Ability        .90  .91  .89  .85  .85      .89
   (90 items)
V. Ability to Comprehend                .87  .91  .75  .84  .75      .81
   Mechanical Principles (52 items)
VI. Spatial Visualizing Ability         .89  .88  .91  .90  .90      .92
   (56 items)
VII. Understanding of Modern            .81  .83  .46  .82  .86      .80
   Society (70 items)

Number of Cases                         178  128  172  181   98      366

In addition to the separate scores for the 
seven tests of the Inventory, an eighth score, the 
Composite Score, is obtained by adding together 
the raw scores of Tests II, III, and IV. This 
Composite Score represents a general verbal 
and quantitative aptitude score. Preliminary 
studies on experimental forms of the test indicated
that, for practical purposes, this Composite
Score was the best index obtainable from
the entire battery of the candidate’s general 
aptitude for engineering study; it was one of 
the purposes of the present study to reexamine 
this preliminary finding concerning the Compos- 
ite Score. 

² See K. W. Vaughn, The Pre-Engineering Inventory,
J. Engng. Educ., 1944, 34, 615-625, for a description of
the separate tests and the criteria by which they were
chosen.


Reliability of the Tests 


Reliability coefficients are given in Table 1 
for a random sample group taken from the 
April 1948 nationwide testing of applicants to 
engineering schools, and also for recently en- 
rolled freshmen students in five schools where 
testing was carried out during the fall of 1947. 
These coefficients are based on a division of 
each test into “rational halves.” In making 
this division any group of items based on a 
single paragraph of reading matter or on a 
single chart or table was treated as an indi- 
visible unit. An attempt was made to match 
the halves, in so far as possible, with respect to 
item difficulty, subject matter, item type, and 
position in the test. 

The reliability coefficients were obtained
from the correlation between half-test scores
by the application of the Spearman-Brown
Formula (modified so as to take into account
the fact that the two “halves” of the test did
not necessarily contain equal numbers of
items). Since the number of groups of items
in a single test was not always large, it was
often impossible to achieve satisfactory matching
of the test items. As a result, the reliability
coefficients presented probably tend to
be underestimates of the actual test reliabilities
in many cases.
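
The unmodified Spearman-Brown step-up can be sketched as below. The report does not give the modified form used for halves of unequal length, so this sketch covers only the standard case of two equal halves; the function name and the half-test correlation in the example are hypothetical.

```python
def spearman_brown(r_half, lengthening_factor=2.0):
    """Estimated reliability of a test lengthened by the given factor,
    from the correlation between comparable parts of the original length."""
    k = lengthening_factor
    return k * r_half / (1 + (k - 1) * r_half)

# Hypothetical correlation of .80 between two matched half-tests.
full_test_reliability = spearman_brown(0.80)
print(round(full_test_reliability, 3))
```

Poorly matched halves lower the half-test correlation entered into the formula, which is consistent with the remark above that the reported coefficients are probably underestimates.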

Test V (Ability to Comprehend Mechanical 
Principles) and Test VII (Understanding of 
Modern Society) do not have as consistently 
high reliabilities as would be desired. The 
level of reliability of the other five tests appears 
to be sufficiently high to justify a good measure 
of confidence in the scores. It may be stated 
that there is a strong tendency in these data 
for a test to have the highest reliability coef- 
ficients in those groups having the highest mean 
or the greatest variability of scores. (It would 
be desirable both in Table 1 and in most 
of the subsequent tables to present the mean 
and standard deviation of each of the different 
groups studied on each of the eight Inventory 
scores, but space considerations make it unde- 
sirable to present all relevant data. The 
schools and other groups studied here and 
throughout this report differ very greatly 
among each other with respect to mean score 
and with respect to variability of scores on 
any one test.) 

Since the Composite Score is used much more 
than the score on any single test, its reliability 
is of greater importance than those presented in 
the table. The Composite Score reliability for 
the 366 cases in the April 1948 National Pro- 
gram group is 0.95. This is a highly satis- 
factory value. The comparable coefficients for 
the different school groups in Table 1 have not 
been calculated, but the general level of the 
test reliabilities indicates that the Composite
Score reliabilities for the several schools could
be expected to fluctuate in the neighborhood
of 0.95.


Twelve-School Study³


³ This twelve-school study was planned by K. W.
Vaughn as Director of the Graduate Record Office and
was carried through largely under his direction. For
earlier validity figures see K. W. Vaughn. Basic considerations
in a program of freshman evaluation.
J. Engng. Educ., 1944, 35, 161-179.


Studies were made of the correlations of
Inventory scores with achievement records and
other relevant data for the freshman and, in
some cases, the sophomore classes in twelve
engineering colleges.⁴ The test scores for these
students had been obtained by testing the entire
freshman class shortly after enrollment.
The names of the institutions are as follows:
California Institute of Technology; Carnegie
Institute of Technology; Columbia University;
Georgia School of Technology; […]
Institute of Technology; Newark College of
Engineering; North Carolina State College of
Agriculture and Engineering; Oklahoma Agricultural
and Mechanical College; Oregon State
College; University of California at Los
Angeles; University of Michigan; and University
of Texas. In two cases, two separate
groups of students from the same school were
analyzed separately because they took dissimilar
courses, so that fourteen, rather than
twelve, “school groups” are discussed in the
present report. In reporting results the identity
of the schools will not be disclosed. Dates
of testing, which cover the period from […] to
1946, and also numbers of individuals will be
presented with the other data in the tables
which follow.
⁴ A year after the study reported here was begun,
data on the academic achievement of students tested as
freshmen were obtained from twenty-two additional
engineering colleges. These data have not been analyzed
to date.


Care should be taken in considering the correlation
or lack of correlation of the Pre-Engineering
Inventory scores with school grades.
A score on one of the Inventory tests may have
little usefulness for predicting average school
grades but may nevertheless prove to be very
useful for guidance or for selecting students
with good potentialities for success after graduation
from engineering school. The Understanding
of Modern Society test, for example,
should not be expected to correlate highly with
engineering school average grades.

Another fact that must be taken into account
is that school grades are rarely as statistically
reliable as the scores on objective tests.
Hence correlations with school grades
represent attenuated estimates of […].
Moreover the necessity of utilizing only those
students who actually were admitted to engineering
school and persisted through one or
more semesters of study limits the range of talent
to an upper range, as compared with the
total range of talent in the group from which
selection was originally made. This definitely
reduces the amount of correlation which is ob- 
tained, particularly if the degree of selection 
is great on those characteristics measured by 
the selection test. The present study, there- 
fore, is limited to demonstrating how well the 
Pr e-Engineering Inventory correlates with the 
engineering school grades of students already 
admitted to engineering school. It does not 
demonstrate how well the Pre-Engineering In- 
ventory selects from among all engineering 
applicants those who will obtain the best engi- 
neering school grades, nor how well the In- 
ventory would predict an ideal measure of 
engineering school success above and beyond 
course grades, including the examinee’s later 
work as a practicing engineer. 
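The two effects described above (unreliable grades and a restricted range of talent) can be illustrated with the standard textbook corrections. The sketch below is only an illustration: the authors report uncorrected correlations, and the reliability of 0.80 and the standard-deviation ratio of 1.5 are hypothetical values chosen for the example. The range-restriction formula assumes direct selection on the predictor (often called Thorndike's Case 2), which may not match how these students were actually admitted.

```python
import math

def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman's correction: estimated correlation if both measures were perfectly reliable."""
    return r_xy / math.sqrt(rel_x * rel_y)

def correct_for_range_restriction(r_restricted, sd_ratio):
    """Correction for direct selection on the predictor.

    sd_ratio is the predictor's standard deviation among all applicants
    divided by its standard deviation among the admitted students.
    """
    r2 = r_restricted ** 2
    return (r_restricted * sd_ratio) / math.sqrt(1 - r2 + r2 * sd_ratio ** 2)

observed = 0.60  # the median first-term validity reported for the Composite Score
# Hypothetical: grades have reliability 0.80; the test is treated as perfectly reliable.
print(round(correct_for_attenuation(observed, 1.0, 0.80), 3))
# Hypothetical: applicants are 1.5 times as variable on the test as admitted students.
print(round(correct_for_range_restriction(observed, 1.5), 3))
```

Both corrections raise the estimate, which is the direction of bias the text describes; how large the corrections would be for these schools cannot be determined from the reported data.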

1. Correlations of the Test Scores with Aver- 
age Grades. One of the main purposes of this 
study was to determine the validity of the Com- 
posite Score for predicting engineering school 
success as measured by the various average 
grades available for enrolled students. The 




Composite Score validities were obtained by 
computing the correlations between the Compo- 
site Score and the various available averages of 
course grades. The correlations between scores 
on the individual Pre-Engineering Inventory 
tests and average grades were also obtained in 
order to determine how scores on the separate 
tests might be related to engineering school suc- 
cess. These latter correlations should not be 
considered as validity coefficients since it is not 
intended that any single test score should be 
used to predict average grades in engineering 
school. 
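As a rough illustration of the computation described above, the sketch below forms a composite as the sum of three test scores and correlates it with average grades. The scores and grades are invented for the example; only the composition of the Composite Score (Tests II, III and IV) comes from the report.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scaled scores on Tests II, III and IV for five students.
test_scores = [(52, 48, 61), (40, 45, 50), (65, 70, 68), (55, 52, 47), (47, 60, 58)]
# Hypothetical first-term grade averages for the same students.
grades = [2.9, 2.1, 3.6, 2.4, 3.0]

composite = [sum(scores) for scores in test_scores]  # equal weights, II + III + IV
validity = pearson_r(composite, grades)
print(round(validity, 2))
```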

The correlations of each test score and of 
the Composite Score with average first-term 
grades are presented in Table 2; the Composite 
Score validities for school groups having more 
than one term of achievement records are pre- 
sented in Table 3. It will be noted that Table 
2 shows a median correlation of 0.60 between 
Composite Score and average first-term grades, 
a very satisfactory value. Except where 
otherwise indicated by footnotes, all correla- 


Table 2

Correlations of Pre-Engineering Inventory Scores with Average First-Term Grades
(July 1944 to September 1946 Testings)

Tests: I, General Verbal Ability; II, Technical Verbal Ability; III, Comprehension of Scientific Materials; IV, General Mathematical Ability; V, Comprehension of Mechanical Principles; VI, Spatial Visualizing Ability; VII, Understanding of Modern Society; Comp., Composite Score (II+III+IV).

School  No. of
Group   Cases    I      II     III    IV     V      VI     VII    Comp.
A       285      ?      ?      ?      .58    .38    .42    .53    .68
B       176      .42    .52    .65    .67    ?      .35    ?      .67
C       75       .24*   .47    .53    .65    ?      ?      ?      ?
D       403      .39    .50    .58    .63    .55    .36    ?      ?
E       52       .37    .43    ?      .71    .35*   ?      ?      ?
F       79       .50    .52    .50    .63    .40    .32    .46    ?
G       391      .35    .48    .56    .58    .35    .35    ?      ?
H       228      .39    .46    .50    .58    .42    .37    ?      ?
I       189      ?      ?      .27    .46    ?      ?      ?      ?
J       87       .16**  ?      .44    .60    .33    .36    .25*   .51
K       84       .34    .30    .45    ?      .36    .30    ?      ?
L       333      .34    .38    ?      .42    .37    .28    .30    ?
M       195      .14*   .25    .41    ?      .30    ?      .28    ?
N       100      .20*   .29    .38    .38    ?      .16**  .33    .38
Median           .35    .48    .50    .58    .37    .35    .40    .60

? Entry illegible in the source.
* There is more than 1 chance in 100, but no more than 1 chance in 20, that a correlation as large as this could arise solely from sampling fluctuations.
** There is more than 1 chance in 20 that a correlation as large as this could arise solely from sampling fluctuations.
*** This test was not given at this school.


Table 3

Validity of Composite Score for School Groups for Which Achievement Records Covering More Than One Term Were Available
(July 1944 to November 1945 Testings)

Entries are validity coefficients for average grades, with the number of cases in parentheses.

School
Group   1st-Term     2nd-Term     Two-Term     Three-Term
A       .68 (285)    .56 (208)    .65 (208)
B       .67 (176)    .70 (84) [column uncertain]
D       .66 (403)    .43 (338)    .53 (338)    ?
G       .61 (391)    .30 (293), .38 (94) [columns uncertain]
H       .59 (228)    .51 (195)    .58 (195)    ?
K       .50 (84)     .63 (68)     .58 (?)
L       .48 (333)                 .50 (274)
M       .44 (195)                 .52 (156)

? Entry illegible in the source.

tions in these tables are significant at the one 
per cent level—there is less than one chance 
in one hundred that correlations as large as 
these could arise solely from sampling fluctua- 
tions. The Composite Score validities are with- 
out exception significant in this sense. 
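The significance markers used in these tables can be approximated with Fisher's z transformation. This is only a sketch: the report does not say which test the authors used, and the normal approximation below can disagree with an exact test for borderline values.

```python
import math

def two_sided_p(r, n):
    """Approximate two-sided p-value for a correlation r from n cases (Fisher z)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def significance_marker(r, n):
    """Return the footnote marker the tables would attach to this correlation."""
    p = two_sided_p(r, n)
    if p < 0.01:
        return ""    # significant at the one per cent level
    if p <= 0.05:
        return "*"   # more than 1 chance in 100, no more than 1 in 20
    return "**"      # more than 1 chance in 20

# Legible entries from Table 2: group C Test I, group J Test I, group N Test I.
for r, n in [(0.24, 75), (0.16, 87), (0.20, 100)]:
    print(r, n, repr(significance_marker(r, n)))
```

For these three legible entries the approximation reproduces the markers printed in Table 2.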

Within a number of the school groups there 
was great variability in the course of study 
taken by different students—a situation that 
inevitably reduces the Composite Score validity 
coefficients, since the average grades do not 


have a consistent meaning from student to student. Another situation that reduces the validity of the Composite Score in some cases is the inclusion in the grade average of such subjects as foreign languages, military science, social sciences, ethics, Bible, music, and so forth.

As would be expected, investigation reveals a tendency for school groups where such situations exist to have Composite Score validities below the median validity for all schools.
Table 4

Median Pre-Engineering Inventory Test Intercorrelations for Fourteen School Groups

                                                                                        Composite
                                                                                        Score
                                                 I     II    III   IV    V     VI    VII   (II+III+IV)
I.   General Verbal Ability                            .63   .63   .38   .32   .20   .66   .60
II.  Technical Verbal Ability                    .63         .70   .56   .41   .32   ?     .87
III. Ability to Comprehend Scientific Materials  .63   .70         ?     ?     ?     ?     .91
IV.  General Mathematical Ability                .38   .56   ?           .55   .50   .49   .84
V.   Ability to Comprehend Mechanical Principles .32   .41   ?     .55         .54   ?     .59
VI.  Spatial Visualizing Ability                 .20   .32   ?     .50   .54         .26   .50
VII. Understanding of Modern Society             .66   ?     ?     .49   ?     .26         .64
Composite Score                                  .60   .87   .91   .84   .59   .50   .64

? Entry illegible in the source.


School groups M and N of Table 2—the two 
groups with the lowest validities—are both 
characterized by great variability in the course 
of study pursued by different students. School 
group L and school group M both consist of two
or more distinct groups of students matricu- 
lating at different times of year, and this fact 
probably accounts in part for the low validities 
obtained for these groups. Group K was a 
summer-school group that did not take a full 
term’s work during their first “term” of study, a fact that would tend to reduce the reliability of the average grade and thus reduce the Composite Score validity for this group.

Of the groups with first-term validities be- 
low the median, four (H, K, L, M) have Com- 
posite Score validities for more than one term 
of work. In school group H the cumulative 
two-term validity is very slightly lower than 
the first-term validity. In the other three 
school groups the validities for predicting the 
two-term averages are higher than those for 
predicting the first-term averages. 


2. The Test Intercorrelations. Relationships among the individual tests are shown in Table 4, which presents for each possible pair of tests the median of the fourteen separate intercorrelations obtained for the fourteen separate school groups. All the median correlations are of such magnitude that they may be taken to represent real relationships among the tests.


Table 5

Comparison of Validity Coefficients Obtained for Optimally Weighted Averages of Various Tests with the Coefficients Obtained for a Single Pre-Engineering Inventory Score

School                    Validity       Optimal Weights** for Test
Group*  Tests Used        Coefficient    I      II     III    IV     V      VI     VII
A       3                 ?
A       Composite         .678
A       2, 3, 4           .684                  ?      .37    .26
A       2, 3, 4, 7        .696                  .12    .29    .24                  .17
A       All tests         .690           -.01   .42    .28    .22    -.03   .09    .48
D       4                 .632
D       Composite         .659
D       2, 3, 4           .675                  ?      .46    .44
D       All tests         .678           -.05   .19    ?      ?      .09    .01    ***
L       3                 .508
L       Composite         .534
L       2, 3, 4           .537                  .14    .28    .48
L       All tests         .554           -.01   ?      .14    ?      -.02   .05    .19
M       4                 .537
M       Composite         .497
M       2, 3, 4           .549                  .04    .13    ?
M       2, 3, 4, 7        .580                  .03    .02    ?                    ?
M       1, 4, 5           .542           .05                  ?      -.08
M       1, 2, 3, 5, 7     ?              ?      ?      ?             ?             ?
M       All tests         .584           -.19   .40    ?      ?      -.21   .03    .24

? Entry illegible in the source.
* The group of students on which Table 2 is based is not in every case exactly the same as the group used in the multiple correlation study. This fact causes a slight discrepancy between the correlations obtained in some cases.
** The weights given are multiple regression weights for the case when the standard deviations of the test scores have been equalized.
*** School did not administer Understanding of Modern Society.
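The summary in Table 4 is an entry-by-entry median across the school groups. A minimal sketch of that operation follows, using three small invented correlation matrices rather than the fourteen actual ones, which the report does not give.

```python
from statistics import median

def median_matrix(matrices):
    """Entry-by-entry median of several equally sized correlation matrices."""
    size = len(matrices[0])
    return [
        [median(m[i][j] for m in matrices) for j in range(size)]
        for i in range(size)
    ]

# Hypothetical intercorrelations of two tests in three school groups.
groups = [
    [[1.0, 0.6], [0.6, 1.0]],
    [[1.0, 0.7], [0.7, 1.0]],
    [[1.0, 0.5], [0.5, 1.0]],
]
print(median_matrix(groups))
```

Because each entry is summarized separately, the median matrix need not correspond to any single school group.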


Multiple Correlation Study 


The three tests included in the Composite Score (Technical Verbal Ability, Ability to Comprehend Scientific Materials, and General Mathematical Ability) are in general the single tests most predictive of engineering college grades.
This may be seen by reference to Table 2 in 
which the median of the correlations of each 
test with first-term average grades is presented. 
However this is not true for every group. It 
can be seen that for some schools the General 
Verbal Ability, Ability to Comprehend Mechan- 
ical Principles, and Understanding of Modern 
Society scores had higher correlations with first- 
term average grades than did Technical Verbal 
Ability. This raises the question as to whether 
or not some alternative combination of tests 
would provide a better composite score for 
predicting average grades. 


It was further found in Table 2 that in five of the fourteen school groups the correlation with first-term average grades was lower for the Composite Score than for General Mathematical Ability. In two additional groups the correlation with average grades was at the same level for both Composite Score and General Mathematical Ability. For these groups some other weighted average of the tests would undoubtedly have provided a better prediction of average grades than did the Composite Score.

A multiple correlation study was therefore undertaken to obtain evidence as to whether or not some other combination of Pre-Engineering Inventory tests would be preferable to the Composite Score now in use. Four schools, each with a large number of cases, were chosen for study. For each school it was determined what weights should be applied to the Pre-Engineering Inventory test scores in order that the weighted average of all the tests should provide the mathematically best possible prediction of average grades. The correlation of average grades with this optimally weighted average of all tests was computed as a measure of the effectiveness of the best prediction obtainable (see Table 5). This multiple correlation may be considered to be a validity coefficient.
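For standardized scores, the optimal weights are the solution of the normal equations formed from the test intercorrelations and the test validities, and the squared multiple correlation is the sum of each weight times its validity. The sketch below shows that calculation in plain Python; the two-test correlations are hypothetical, and the report does not describe the computing procedure the authors actually followed.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def optimal_weights(intercorrelations, validities):
    """Standardized regression weights and the resulting multiple correlation."""
    weights = solve_linear(intercorrelations, validities)
    r_squared = sum(w * v for w, v in zip(weights, validities))
    return weights, math.sqrt(r_squared)

# Hypothetical: two tests correlating .50 with each other and .60 and .50 with grades.
weights, multiple_r = optimal_weights([[1.0, 0.5], [0.5, 1.0]], [0.6, 0.5])
print([round(w, 3) for w in weights], round(multiple_r, 3))
```

In this invented case the multiple correlation is only slightly above the better single validity, which parallels the small gains the report finds for three of the four schools.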
As an example of how Table 5 may be interpreted, it is seen that in School A the optimum weights for Tests II, III, IV, and VII are in the ratio of 12:29:24:17. The corresponding weighted average of these tests correlates 0.696 with average grade, as compared to the comparable value of 0.678 for Composite Score.

In three of these schools it was found that the Composite Score functioned almost as well for predicting first-term average grade as would the optimally weighted average of all Pre-Engineering Inventory scores. In the fourth school the Composite Score did not function at all adequately from this point of view. Two facts about this school are relevant here: first, practically none of the students at this school took English, or any related subject, during the first term; second, the group studied in this school was composed of several entering freshman classes that had been combined in order to produce an adequate number of cases for the statistical analysis.
Table 6

Optimal Relative Weights* for Raw Scores on Pre-Engineering Inventory Tests for the Purpose of Predicting Average Grade

            Technical         Ability to Comprehend
School      Verbal Ability    Scientific Materials
A           0.5               1.3
D           0.7               0.6
L           0.7               1.3
M           0.2               0.7

* These relative weights are constructed so that [illegible], so as to facilitate comparison with the weights used to obtain the Composite Scores, which are 1.0:1.0:1.0.


Table 7

Correlations Between Pre-Engineering Inventory Scores and Course Grades
(403 Students tested July 1944)

                                                  First-Term   Course Grades
                                                  Average                                      Engineering
Tests*                                            Grade        Calculus  Chemistry  Physics   Drawing      English
I.   General Verbal Ability                       .39          .24       .37        .32       .15          .37
II.  Technical Verbal Ability                     .50          ?         ?          ?         ?            .32
III. Ability to Comprehend Scientific Materials   ?            ?         .52        ?         ?            .33
IV.  General Mathematical Ability                 ?            .55       .58        ?         .20          ?
V.   Ability to Comprehend Mechanical Principles  ?            ?         ?          ?         ?            .26
VI.  Spatial Visualizing Ability                  .36          .24       .38        ?         ?            ?
Composite Score (II+III+IV)                       .66          .56       ?          .59       .23          .37

? Entry illegible in the source.
* Test VII was not administered at this institution.


The weights obtained in Table 5 apply for the case when the scores on the tests have been adjusted so that all scores have equal standard deviations. Table 6 presents the optimal relative weights for application to actual raw
scores. The data shown in Table 6 for the 
four schools studied give some indication that 
better prediction of first-term grades might be 
obtained if Technical Verbal Ability were less 
heavily weighted in obtaining the Composite 
Score. The improvement of prediction so ob- 
tained would probably be of little, if any, 
practical importance in the case of most schools. 
A few of the schools, however, would probably 
benefit appreciably if they could construct 
their own composite score in accordance with 
their local needs. Obviously no single compo-
site score will provide optimum prediction for 
every school. 
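The step from weights for equalized scores to weights for raw scores amounts to dividing each standardized weight by that test's standard deviation. The sketch below then rescales the results so that one reference test has a weight of 1.0; that normalization is an assumption made for the example, since the footnote describing the authors' own scaling is illegible. All numbers are hypothetical.

```python
def raw_relative_weights(standardized_weights, standard_deviations, reference=-1):
    """Convert standardized weights to raw-score weights relative to a reference test.

    Each standardized weight is divided by its test's standard deviation, and all
    results are then divided by the reference test's raw weight (assumed scaling).
    """
    raw = [w / sd for w, sd in zip(standardized_weights, standard_deviations)]
    return [w / raw[reference] for w in raw]

# Hypothetical standardized weights and raw-score standard deviations for three tests.
relative = raw_relative_weights([0.2, 0.3, 0.4], [10.0, 5.0, 8.0])
print([round(w, 2) for w in relative])
```

A test with a small raw-score standard deviation needs a larger raw weight to have the same influence, which is why raw and standardized weights can rank the tests differently.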


Validity for Predicting Specific Course Grades⁵

The correlations between the Pre-Engineering Inventory scores and first-term freshman grades in individual courses have been obtained for one of the larger colleges of engineering. The records of 403 students who took an identical course of study were available for this institution. Table 7 presents the correlations of the various test scores and of the Composite Score with grades in individual college subjects and with term averages. All correlations are statistically significant at the one per cent level.

⁵ This study was planned by and carried through under the direction of K. W. Vaughn as Director of the Graduate Record Office.


Validity for Predicting Achievement at 
the Sophomore Level 

The efficiency of the Pre-Engineering In- 
ventory tests for predicting engineering school 
achievement has also been investigated by ob- 
taining the correlations between the Pre- 
Engineering Inventory tests and the various 
Engineering Achievement Tests. The Engi- 
neering Achievement Tests are designed to meas- 
ure achievement in several specific branches 
of the beginning engineering curriculum. The 
seven tests in this battery measure the stu- 
dents’ knowledge of fundamental engineering 
principles and terminology and their applica- 
tion to the solution of specific problems. The 
Achievement Tests are available to colleges of 
engineering participating in the Measurement 
and Guidance Project in Engineering Educa- 
tion and are used primarily in the examination 
of regularly enrolled sophomore students at or 
near the end of the academic year. 

Data were available for 430 students in one school who took the Pre-Engineering Inventory in September 1946 and the Engineering Achievement Tests in May 1948, and for 236 students in a second school who took the Inventory in October 1946 and the Achievement Tests in May 1948. The correlations appear in Table 8.


Table 8

Correlations Between Pre-Engineering Inventory Scaled Scores and Engineering Achievement Test (Form B) Raw Scores
(236 Students at School Q Tested in October 1946 and May 1948, and 430 Students at School R Tested in September 1946 and May 1948)

Engineering Achievement Tests: (1) English Expression; (2) Engineering Drawing; (3) General Chemistry; (4) Mathematics; (5) Physics (Mechanics); (6) title partly illegible, includes "Sound"; (7) title illegible.

Pre-Engineering Inventory Tests                   School   (1)    (2)    (3)    (4)    (5)    (6)    (7)
I.   General Verbal Ability                       Q        .70    .32    .46    .35    .38    .39    *
                                                  R        .58    .36    ?      .23    .27    *      ?
II.  Technical Verbal Ability                     Q        .52    ?      .66    .30    ?      ?      *
                                                  R        .37    .42    ?      .30    .35    *      ?
III. Ability to Comprehend Scientific Materials   Q        .58    .52    ?      .46    ?      ?      *
                                                  R        ?      ?      ?      ?      ?      *      ?
IV.  General Mathematical Ability                 Q        .52    .35    .51    ?      .55    .40    *
                                                  R        .39    ?      .41    .51    .51    *      ?
V.   Ability to Comprehend Mechanical Principles  Q        ?      .54    .49    ?      .54    .40    *
                                                  R        .30    .46    .38    .39    .53    *      ?
VI.  Spatial Visualizing Ability                  Q        .36    .52    ?      .36    .38    .20    *
                                                  R        .18    .32    ?      ?      .36    *      ?
VII. Understanding of Modern Society              Q        ?      .32    .46    .34    .43    .41    *
                                                  R        ?      .33    ?      .24    .39    *      ?
Composite Score                                   Q        .62    .49    .67    .46    .58    ?      *
                                                  R        ?      .50    .34    ?      .37    *      ?

? Entry illegible in the source.
* Not administered.


[...] are quite good in [...] time as great as [...] the two testings. [...] presented in Table 8 is that the Composite Score is the [...] achievement, [...] Achievement [...]

This study further substantiates findings [...] that the Pre-Engineering Inventory in general and the Composite Score in particular are [...] predictors of engineering school success.


General Conclusions 


The results of the several test analyses described above indicate that the Pre-Engineering Inventory Composite Score on the whole performs very satisfactorily as a predictor of engineering school success. Of course, a single set of weights will not provide optimum prediction for every school. The use of a differently weighted combination of those tests that comprise the Composite Score, or possibly of other tests, would undoubtedly appreciably improve the prediction in some schools. However, it is not certain that the advantages gained by computing a special composite score would justify the expense involved.

An examination of the correlations pre- 
sented here for the Pre-Engineering Inventory 
tests not included in the Composite Score is of 
value for understanding these tests and their 
relation to the engineering school curriculum. 
The utility of these tests cannot be unequiv- 


ocally evaluated by the approach used in the 
studies covered here, since these tests have 
functions beyond the prediction of college 
grades, particularly as aids in the guidance of 
the student and in the selection of broadly 
gifted individuals who will do credit to the 
profession after graduation. 


Received November 18, 1949.
Early publication. 


The Kuder Literary Scale as Related to Achievement in 
College English * 


A. Kimball Romney 


University of Wisconsin 


The relation between scores obtained on different types of interest inventories and achievement has been summarized and appraised elsewhere.¹ Most of the studies which examined the relation between Kuder interests and achievement showed positive relationships.² It can be added that the studies were
done in the main with rather small groups 
using, most frequently, college grades as a 
criterion of achievement. Whether results 
from larger groups using more refined methods 
would give different results has not been known. 

The present note is concerned with report- 
ing results obtained using a precise index of 
achievement on a fairly homogeneous group 
of over a thousand subjects.

Specifically it concerns correlation data between the Kuder literary score, ACE scores (American Council on Education Psychological Examination for College Freshmen, 1947 edition), and achievement in college English classes. It is the nature of the achievement score that makes these data of special interest. They are noteworthy for the following reasons: (a) the group of subjects is composed of 1085 (566 male, 519 female) freshman students who took English 1 Autumn Quarter at Brigham Young University in 1947; (b) since it included all new freshmen the results are somewhat free of many errors that might have been introduced by sampling procedures; (c) the subjects were all exposed to as near the same amount of instruction as possible during the quarter; (d) the achievement is judged not on the basis of a grade (as "A", "B", etc.) but rather on the basis of a long (554 item), objective, carefully administered achievement test given at the end of the fall quarter; and (e) college aptitude as measured by the ACE is taken into account as an important variable.

* The author is grateful to D. [remainder of footnote illegible]

¹ Super, D. E. Appraising vocational fitness. New York: Harper and Brothers, 1949.
² Super, op. cit., pp. 457-458.


The English Achievement Examination


The test, which covered all the areas taught in English 1 during Autumn Quarter 1947, was prepared by a faculty committee before the beginning of the quarter. It consisted of items arranged into five sections: vocabulary, essays, short stories, grammar, and miscellaneous. At the beginning of the quarter the English teachers received a letter from the department head indicating the [...] test, amount of time which would be allotted for each area, and when it would be given. This information was given to the students at the same time. Similar letters were sent at mid-term and three weeks before the close of the quarter. No teacher knew the exact questions and none of the teachers who taught in the experiment had seen the test before it was administered.

At the beginning of the quarter the students were divided into ability areas on the basis of the Purdue English Placement Test. Teachers were assigned to teach the respective groups, i.e. each taught some groups classified as "poor," some classified as "intermediate" and some groups classified as "good." This scattered the teachers over the entire classification of groups, equalizing the teaching as nearly as possible. All teachers used the same text and the same units of work for all classes.

At the close of the quarter, teachers from the university were called in to the English department and were given instructions as to the administration of the achievement test and it was administered to all students the same afternoon under [illegible] conditions. The teachers and [...] who administered it had all received special training for that purpose and the test, which was under the supervision of the Counseling Service, was scored on International Business Machine equipment.


Results 


The various correlation coefficients ob- 
tained are shown in Table 1. 

It can be seen that the Kuder literary scale has a small but statistically significant correlation with English achievement which is roughly .3 for both male and female. This does not rise significantly when possible effects of ACE are partialled out.

The correlation between ACE and English achievement was high for both males and females, .69 and .84 respectively.³

With reference to the multiple R, it can be 
seen that no significant changes result by the 
addition of the Kuder literary scale. It is to 
be noted that the Kuder scale and the ACE 
are apparently quite independent of each other. 
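The partial and multiple correlations discussed above follow from the three pairwise correlations among achievement, the Kuder literary scale and the ACE. The sketch below applies the usual two-predictor formulas to the values given for the male group. The Kuder-ACE correlation is not reported in the note, so the value of .10 used here is purely an assumption consistent with the remark that the two measures are "quite independent".

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion y with two predictors."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

r_english_kuder = 0.27  # reported for males
r_english_ace = 0.69    # reported for males
r_kuder_ace = 0.10      # assumed; not given in the note

print(round(partial_r(r_english_kuder, r_english_ace, r_kuder_ace), 3))
print(round(multiple_r(r_english_kuder, r_english_ace, r_kuder_ace), 3))
```

Under that assumption the results fall close to the partial and multiple coefficients reported for the male group, but they would shift with a different Kuder-ACE correlation.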

It can be concluded that as far as these data are concerned the correlation between achievement in a college English class and the Kuder literary scale is very low, even though statistically significant.

³ No explanation is offered for the difference in coefficients observed between male and female scores.


Table 1

Correlation Coefficients between English Achievement, Kuder Literary Scale, and ACE

                                  Independent Variables
                                                         Kuder Plus    Kuder with ACE(b)
Dependent Variable                Kuder       ACE        ACE(a)        Constant
English Achievement
  Male (N=566)                    .27±.04     .69±.02    .71±.02       .28±.04
English Achievement
  Female (N=519)                  .29±.04     .84±.01    .85±.01       .29±.04

(a) Multiple correlation.
(b) Partial correlation.


Received October 5, 1949.
Early publication. 


Scores on the Strong Vocational Interest Blank and the Kuder 
Preference Record in Relation to Self Ratings 


Ralph F. Berdie 


Student Counseling Bureau, Office of the Dean of Students, University of Minnesota 


Expressed interests and measured interests 
are far from identical. In terms of common 
elements, perhaps no more than between 25 
and 50 per cent of the factors associated with 
one are associated with the other. Various 
studies report correlation coefficients ranging 
about .50 between measured and expressed 
interests, depending upon the test used, the 
method employed for eliciting and classifying 
expressed interests and the sample. 

In a study of the Strong Vocational Interest Blank scores of 1000 men who came to a University Testing Bureau, Darley (5) reported contingency coefficients between claimed vocational choices and interest score patterns ranging from .35 to .57. He concluded, "Claimed vocational choices cannot be substituted for measured interests in effective counseling."
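The contingency coefficient cited above is computed from a cross-classification of two categorical variables, here claimed choice against measured interest pattern. A minimal sketch with an invented two-by-two table follows; Darley's actual tables are not reproduced in this article.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

def contingency_coefficient(table):
    """Pearson's C = sqrt(chi2 / (chi2 + N))."""
    chi2 = chi_square(table)
    n = sum(sum(row) for row in table)
    return math.sqrt(chi2 / (chi2 + n))

# Hypothetical counts: rows are claimed choices, columns are measured patterns.
counts = [[30, 10], [10, 30]]
print(round(contingency_coefficient(counts), 3))
```

The upper limit of C depends on the number of categories, so coefficients from tables of different sizes are not directly comparable with one another or with product-moment correlations.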

Lalegar (11) [...] along with the [...] to determine th[e] [...] [scor]es of 703 [...] age group for which [...] appropriate. She [...] choice of occupa[tion] [...] may be considered [...] the lack of relation[ship] [...] vocational interest [...] [con]tribute materially [...]

[...]g (9) report the relationship between scores on the Kuder Preference Record and [answer]s to a simple questionnaire [...] vocational interests. [...] nine scores on the Kuder Record were ranked in order from one to nine and these rankings were correlated with the orders in which the occupations were ranked by the students. The authors do not explain how these correlation coefficients were obtained, but they state that for 115 boys in grade 10B, the correlation between questionnaires and Kuder scores was .59 and for [...] girls in grade 10B, the correlation was [...].

Rose (14) gave the Kuder blank to 60 "[...] selected" veterans referred to a [...] Advisement Unit. Each veteran was given nine cards. On each card were occupations selected from Kuder’s manual (10) as characteristic of a given area. The subjects ranked the nine areas according to preference for the occupation. The coefficient of contingency obtained was .61. The rank order correlation coefficients for each of the 60 subjects ranged from -.05 to .99, with a median rho of .64.
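The rank order coefficients reported by Rose compare, for one person, the ranking of the nine areas by preference with their ranking by Kuder score. A minimal sketch of Spearman's rho for untied ranks follows; the two rankings are invented for illustration.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical veteran: self-ranking of the nine areas and ranks of his Kuder scores.
self_ranking = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kuder_ranking = [2, 1, 3, 5, 4, 6, 7, 9, 8]
print(round(spearman_rho(self_ranking, kuder_ranking), 2))
```

With only nine areas, a few swapped ranks change rho noticeably, which helps explain the wide individual range that Rose reports.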
Crosby and Winsor (4) administered the Kuder Preference Record to 222 men and women sophomores in a college of agriculture and home economics. These students, after an explanation of the type of interests measured by the test, were asked to estimate their percentile scores, using the Kuder profile sheets. The correlation coefficients between obtained scores and estimated scores on the seven scales ranged from .39 to .60. The median r was .54.
Information is available about the relationship between the Kuder and the Strong tests (18), but no study reported to date allows direct comparisons between the relationship of the Kuder Record and self ratings and the Strong Blank and self ratings. Paterson (13), reporting the case of a returning veteran who took both tests, raises a question concerning the relative ease of "faking" the tests when he suggests: "The Kuder and Strong inventories both yield important information concerning the individual’s interest patterns, when [...] in a guidance situation. However, in a selection situation, it would appear that the Strong is to be preferred because it is more subtle and the vocational significance of liking or disliking each of the 400 items is not so readily apparent to the person taking the test."
Longstaff (12) reports an investigation of 
the fakability of the Strong test and the Kuder 
test where 35 men and 24 women took the 
Strong test, men's form, and 37 men and 22 
women took the Kuder test. The subjects 
first took the tests under “normal” conditions 
and then took them with directions to raise 
their scores on certain scales and lower them on 
others. The results indicated both tests could 
be faked and that some scales were more 
fakable than others. The Strong test was 
easier to fake upwards, the Kuder test easier 
to fake downward.
A theory of vocational interests with particular reference to the Strong Blank has been discussed by Bordin (2). He presents the hypothesis that, "In answering a Strong Vocational Interest Test, an individual is expressing his acceptance of a particular view or concept of himself in terms of occupational stereotypes." In support of this hypothesis, Bordin cites the relationship between claimed and measured interests and also the ability of subjects to manipulate or fake their vocational interest patterns. These phenomena can be explained making use of concepts other than those used by Bordin and we already have suggested here, in reference to the relationship between measured and claimed interests, the effect of common elements.


Purpose


The purpose of this investigation was to determine the relative agreement between scores on the Strong Vocational Interest Blank and self ratings and between scores on the Kuder Preference Record and self ratings, using the same sample. Does the picture a person has of his own interests correspond more closely to the picture given by the Strong Blank or to that given by the Kuder Record?


Method


Each man who came to the Student Counseling Bureau of the University of Minnesota during the first part of 1948 and who was to take an interest test was given the Strong test, the Kuder test and a self rating form.

The rating form was of the graphic rating type and covered the following nine occupational areas: (1) biological sciences; (2) artistic creation and appreciation; (3) physical sciences; (4) technical occupations; (5) social service; (6) musical occupations; (7) business detail; (8) selling; and (9) verbal or literary. For each area the graphic scale was identical. The following is an example:


1. This occupational area centers about the biological sciences and includes medicine, dentistry [...] psychology.

My interests are     Somewhat      No marked        Somewhat     My interests
very much unlike     dissimilar    similarity or    similar      strongly resemble
interests of people                dissimilarity                 the interests of
in this area                                                     people in this area


The order of presentation of the tests and 
the rating form was varied systematically so 
that one-third of the 500 men tested took the 
Strong test first, one-third took it as the second 
test and one-third took it as the last test in the 
series. Similarly, the order of presentation for
the other test and rating form was rotated. 
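One simple way to rotate three instruments so that each appears equally often in each position is a cyclic rotation of the presentation order. The sketch below illustrates that idea; the article does not say which specific orders were used, so the rotation scheme and the assignment by subject number are assumptions.

```python
def rotated_orders(instruments):
    """All cyclic rotations of a list: each item occupies each position once."""
    return [instruments[i:] + instruments[:i] for i in range(len(instruments))]

def assign_orders(n_subjects, instruments):
    """Assign rotations to subjects in turn so that the orders are used about equally."""
    orders = rotated_orders(instruments)
    return [orders[i % len(orders)] for i in range(n_subjects)]

instruments = ["Strong", "Kuder", "Self-rating form"]
assignments = assign_orders(500, instruments)

strong_first = sum(1 for order in assignments if order[0] == "Strong")
print(rotated_orders(instruments))
print(strong_first)
```

With 500 subjects and three orders the groups cannot be exactly equal, so "one-third" in the text is necessarily approximate under any such scheme.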

The ages of the 500 men varied from 14 
years through 37 years, with a mean age of 20.8 
years, a median age of 20.6 years and a stand- 
ard deviation of 3.5 years. Only seven men 
were 16 years of age or younger, only 11 were 
30 years old or older. 

The largest single group, the pre-college 
group (N=195), consisted of people not yet 
registered within the University but planning 
on matriculating within one calendar year. 
The largest proportion of these people were 
completing or just had completed their senior 
year in high school. The non-college group 
(N=19) consisted of people who were not in 
residence in the University and who were not 


44 


planning on entering within the next year. 
The students in the College of Science, Literature
and the Arts (N=126), the Institute of
Technology (N=89), the College of Agriculture,
Forestry and Home Economics (N=13),
the College of Education (N=3), and the 
College of Pharmacy (N=2) ranged from 




students beginning their first year of college 
to students completing their fourth year. 
Similarly, students from the remaining colleges 
tended to come from all classes within those 
colleges. Most of these 500 students, how- 
ever, were people who had completed less than 
two years of college. 


Table 1

Categories Used in Classifying Scales of the Strong Blank, the Kuder Record
and the Self-Rating Form with the Number of Subjects Having Different
Types of Patterns or Scores in Each Area on Each Test


Artistic
  Strong scales: artist, architect
  Self-Rating: item 2
  Kuder: artistic scale

  Strong                        Self-Rating          Kuder
  Type of Pattern      Number   Score    Number      Score      Number
  primary pattern         13      5         55       75-100       120
  secondary pattern       15      4        136       65-74         48
  tertiary pattern        46      3        136       50-64         57
  no pattern             426      2         91       0-49         275
                                  1         82


Scientific-Biological
  Strong scales: osteopath, physician, psychologist, dentist
  Self-Rating: item 1
  Kuder: scientific scale

  Strong                        Self-Rating          Kuder
  Type of Pattern      Number   Score    Number      Score      Number
  primary pattern         47      5         79       75-100       114
  secondary pattern       38      4        193       65-74         46
  tertiary pattern        60      3         98       50-64         77
  no pattern             355      2         67       0-49         263
                                  1         63


Scientific-Physical
  Strong scales: mathematician, physicist, chemist, engineer
  Self-Rating: item 3
  Kuder: scientific scale

  Strong                        Self-Rating              Kuder
  Type of Pattern      Number   Score    Number          Score      Number
  primary pattern         27      5      [illegible]     75-100       114
  secondary pattern        7      4      [illegible]     65-74         46
  tertiary pattern        26      3      [illegible]     50-64         77
  no pattern             440      2         71           0-49         263
                                  1         69


Technical
  Strong scales: farmer, aviator, carpenter, printer, M. & P.S. teacher,
                 policeman, forest service man
  Self-Rating: item 4
  Kuder: mechanical scale

  Strong                        Self-Rating          Kuder
  Type of Pattern      Number   Score    Number      Score      Number
  primary pattern        150      5         98       75-100        97
  secondary pattern       72      4        169       65-74         47
  tertiary pattern        95      3        111       50-64         52
  no pattern             183      2         62       0-49         304
                                  1         60


Social Service
  Strong scales: YMCA phys. dir., personnel dir., pub. admin., YMCA secy.,
                 social science teacher, school supt., minister
  Self-Rating: item 5
  Kuder: social service scale

  Strong                        Self-Rating          Kuder
  Type of Pattern      Number   Score    Number      Score      Number
  primary pattern         71      5         91       75-100       246
  secondary pattern       57      4        173       65-74         40
  tertiary pattern        83      3        100       50-64         63
  no pattern             289      2         73       0-49         151
                                  1         63


Musical
  Strong scales: musician
  Self-Rating: item 6
  Kuder: musical scale

  Strong                        Self-Rating          Kuder
  Type of Pattern      Number   Score    Number      Score      Number
  primary pattern         74      5         51       75-100       175
  secondary pattern       69      4        109       65-74         58
  tertiary pattern        77      3        104       50-64         66
  no pattern             280      2         95       0-49         206
                                  1        141


Business Office (Clerical)
  Strong scales: CPA, accountant, office mgr., purch. agent, banker, mortician
  Self-Rating: item 7
  Kuder: computational and clerical scales

  Strong                        Self-Rating          Kuder
                                                                Number            Number
  Type of Pattern      Number   Score    Number      Score      (computational)   (clerical)
  primary pattern         81      5         70       75-100       151               112
  secondary pattern       78      4        152       65-74         26                29
  tertiary pattern        96      3        118       50-64         55                56
  no pattern             245      2         83       0-49         268               303
                                  1         77


Sales
  Strong scales: sales manager, real estate sales, life insurance sales
  Self-Rating: item 8
  Kuder: persuasive scale

  Strong                        Self-Rating          Kuder
  Type of Pattern      Number   Score    Number      Score      Number
  primary pattern        128      5         62       75-100       271
  secondary pattern       94      4        156       65-74         46
  tertiary pattern        89      3        125       50-64         37
  no pattern             189      2         64       0-49         146
                                  1         93


Verbal-Literary
  Strong scales: advertising, lawyer, author journalist
  Self-Rating: item 9
  Kuder: literary scale

  Strong                        Self-Rating          Kuder
  Type of Pattern      Number   Score    Number      Score      Number
  primary pattern         48      5         77       75-100       177
  secondary pattern       51      4        157       65-74         55
  tertiary pattern       140      3        111       50-64         57
  no pattern             261      2         87       0-49         211
                                  1         68




The 37 occupational scales of the Strong 
Blank were divided into nine groups, as shown 
in Table 1, and the 9 scales of the Kuder test 
were similarly classified. Two scales of the 
Kuder, computational and clerical, were clas- 
sified as Business-Office, and the Kuder Scien- 
tific scale was considered as being both “Bio- 
logical science” and “Physical science.” 

Thus, for each student were available 37
scores on the Strong, nine scores on the Kuder
and nine scores on the self rating form. A
pattern analysis (5) was made of the Strong
scores, according to the groups in Table 1. If,
within a group, a plurality of the scores were
A's and B+'s, that was a primary pattern.
If a plurality of scores were B+'s and B's, that
was a secondary pattern. If a plurality of
scores were B's and B-'s, that was a tertiary
pattern. All other groups were labeled “no
pattern.” In case of the musician's scale, an
A was called a primary pattern, a B+, a
secondary pattern, and a B, a tertiary pattern.

A percentile score of 75 through 100 on the
Kuder was called a primary pattern, a percentile
score of 65 through 74, a secondary pattern
and a percentile score of 50 through 64, a
tertiary pattern. All other scores were “no
pattern.” The self ratings in each area were
given values of from one to five, with five
indicating greatest similarity of interests.
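
The cut-offs just described for the Kuder percentiles and for the single-scale musician group can be written out as a short sketch. The function and dictionary names below are illustrative and do not come from the study, and the plurality rule used for the multi-scale Strong groups is not attempted because the text does not specify it fully.

```python
def kuder_pattern(percentile: int) -> str:
    """Classify a Kuder percentile score using the cut-offs given in the text."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if percentile >= 75:
        return "primary"
    if percentile >= 65:
        return "secondary"
    if percentile >= 50:
        return "tertiary"
    return "no pattern"


# Musician's scale of the Strong: one letter rating maps directly to a pattern.
MUSICIAN_RATING_TO_PATTERN = {"A": "primary", "B+": "secondary", "B": "tertiary"}


def musician_pattern(rating: str) -> str:
    """Classify a Strong musician-scale letter rating (hypothetical helper)."""
    return MUSICIAN_RATING_TO_PATTERN.get(rating, "no pattern")


if __name__ == "__main__":
    for score in (80, 70, 55, 30):
        print(score, kuder_pattern(score))
    print("B+", musician_pattern("B+"))
```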


Results 

The three indices of [...] compared on the
[...] significant scores [...] Kuder says that
percentile scores of 75 and above on his test
are significant for purposes of vocational
counseling (10). [...] scores of A and B+ on
his [...] much greater on the Kuder test in
eight of the nine areas and in only one area,
the sub-professional or technical, are there
more significant scores on the Strong test.

[...] contingency coefficient was [...] J. [...]
Psychometric methods, [...] Co., 1936, pp. 357-359.

On the Kuder test, 24 per cent of the group
had significantly high scores on the artistic
scale, while on the Strong, only three per cent
had significantly high scores. Only 15 per
cent had any type of interest pattern in this
area on the Strong. The distribution of self
ratings was more symmetrically distributed,
with 11 per cent claiming much interest in
this area.

Distributions in the musical area are somewhat
similar to those in the art area, with [...]
per cent obtaining A's or B pluses on the
musician's scale of the Strong and 35 per cent
obtaining percentile scores of 75 or more on
the musical scale of the Kuder. On the self
ratings, many more men indicated no interest
in this area than there were men who indicated
much interest.

Again in the scientific areas, relatively few
significant scores were found on the Strong
profile, while many were found on the Kuder.
Combining the biological and physical science
groups on the Strong, only 74 primary patterns
were identified. On the Kuder scientific scale
23 per cent received percentile scores of 75 or
above. The differences in the social service
area are even greater, with 14 per cent having
primary patterns on the Strong and [...] per
cent having high scores on the Kuder. The
trends are similar in the business and literary
areas.

In the technical and mechanical areas, 24
per cent had primary interest patterns on the
Strong and only 19 per cent had high scores
on the Kuder. In only one of the nine areas,
the sales area, were there more primary
patterns than there were in this technical area.
On the Kuder, fewer people had a percentile of 75
or more on the mechanical scale than on any
other scale.


Thus, on the Kuder, the areas having the
greatest number of high scores were the sales
(Persuasive) and the social service areas;
on the Strong, the areas having most primary
patterns were the sales and the technical.
The areas on the Kuder test having fewest
high scores were the mechanical and [...]
areas and on the Strong the areas having
fewest primary patterns were the artistic and
the physical science areas.

In terms of self ratings, most students rated
themselves as very much interested in physical
science and technical work and fewest rated
themselves as very interested in music
and art.

On the Strong test there was a total of 639
primary patterns, or an average of 1.3 per
student and a total of 481 secondary patterns.
Combining these figures, the average student
had 2.2 primary and secondary patterns. On
the Kuder, 1463 scores were at or above the
75th percentile. The average student had
2.9 scores which according to Kuder are significant
in counseling. In the norm group upon
which these percentile scores were based, for
a group of 500 men, one would find 1125 percentile
scores of 75 or above out of the 4500
possible scores available on the nine scales of
the test or an average of 2.3 significant scores
per man.
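
The averages in this paragraph follow from simple division, and the arithmetic can be retraced as below. This is only a check on the reported figures; the variable names are illustrative, and the norm-group expectation assumes that exactly one quarter of the norm group falls at or above the 75th percentile on every scale.

```python
N_SUBJECTS = 500          # men tested
N_KUDER_SCALES = 9        # scales on the Kuder

strong_primary = 639      # primary patterns on the Strong (from the text)
strong_secondary = 481    # secondary patterns on the Strong (from the text)
kuder_high = 1463         # Kuder scores at or above the 75th percentile

avg_primary = strong_primary / N_SUBJECTS
avg_primary_secondary = (strong_primary + strong_secondary) / N_SUBJECTS
avg_kuder_high = kuder_high / N_SUBJECTS

# Assumption: 25 per cent of norm-group scores lie at or above the 75th percentile.
possible_scores = N_SUBJECTS * N_KUDER_SCALES
expected_high = possible_scores // 4
expected_avg_high = expected_high / N_SUBJECTS

if __name__ == "__main__":
    print(f"primary patterns per student:        {avg_primary:.2f}")
    print(f"primary + secondary per student:     {avg_primary_secondary:.2f}")
    print(f"high Kuder scores per student:       {avg_kuder_high:.2f}")
    print(f"norm-group high scores out of {possible_scores}: {expected_high}")
    print(f"norm-group high scores per man:      {expected_avg_high:.2f}")
```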

The relationship found here between measured
and expressed interests approximates
that reported in previous studies. The median
contingency coefficient between the Strong
test and self ratings was .43, between the
Kuder test and self ratings, .52. Table 2
presents the coefficients showing the degree of
relationship in each area between each of the
two tests and self ratings. The chi squares
were all statistically significant beyond the one
per cent level of probability.


Table 2

Contingency Coefficients Showing Relationships Between Self Ratings of
Vocational Interests and Scores on the Strong Vocational Interest
Blank and on the Kuder Preference Record

  Occupational Area       C with Strong    C with Kuder
  Technical                    .55          [illegible]
  Computational                .61          [illegible]
  Physical Sciences            .32             .46
  Social Service               .43             .32
  Musical                      .39             .60
  Sales                        .58             .58
  Biological Sciences          .27             .30
  Verbal-Literary           [illegible]        .61
  Artistic                     .33             .58
  Clerical                    (.61)*           .32

* This is the same statistic as for “Computational.”


In two areas, the Strong scores are more 
closely related to self ratings than are the 
Kuder scores. In five areas, the Kuder scores 
are more closely related to self ratings. The 
range of contingency coefficients for the Strong 
is from .27 to .61, for the Kuder, from .30 to .61. 
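
For readers unfamiliar with the statistic, the contingency coefficient is conventionally obtained from chi square as C = sqrt(chi² / (chi² + N)). The sketch below assumes that conventional Pearson formula; the study's own computing procedure is not recoverable from the text, and the cross-tabulation shown is invented purely for illustration.

```python
from math import sqrt


def chi_square(table):
    """Return (chi square, N) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def contingency_coefficient(table):
    """Pearson's contingency coefficient, C = sqrt(chi2 / (chi2 + N))."""
    chi2, n = chi_square(table)
    return sqrt(chi2 / (chi2 + n))


if __name__ == "__main__":
    # Hypothetical counts: rows are pattern types, columns are self ratings 5..1.
    example = [
        [30, 25, 10, 5, 3],
        [15, 30, 20, 8, 5],
        [10, 35, 30, 15, 10],
        [20, 60, 70, 50, 49],
    ]
    print(f"C = {contingency_coefficient(example):.2f}")
```

Because the upper limit of C depends on the number of rows and columns, coefficients from tables of different sizes are not strictly comparable.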

On both tests, self ratings of interest in the 
biological sciences are least related to relevant 
scores on the tests. The next poorest agree- 
ment is found in the physical science area where 
self ratings have a relationship in terms of 
contingency coefficients to Kuder and Strong 
patterns of .46 and .32 respectively. Self
ratings in the sales area tend to be related to
patterns on both tests. 

Disagreement is found in the clerical and 
computational areas. The patterns on the 
Business-Office group of the Strong tend to be 
accompanied by similar self ratings. The 
Kuder computational scores were only slightly 
related to these self ratings. The Kuder 
clerical scores were more related to clerical 
self ratings than were Kuder computational 
scores. 

In the case of those areas which have strong 
avocational implications, art and music, the 
relations between self ratings and Kuder scores 
are greater than those between self ratings and 
Strong scores. The explanation is perhaps that 
most students are unable to differentiate their 
thinking about vocational and avocational in- 
terests. Clinical experience suggests these two 
scales of the Kuder test are much more meas- 
ures of avocational interests than of voca- 
tional interests. 

When the results obtained here are com- 
pared to those reported by Crosby (4), the 
relative ease of self-estimating scores in the 
persuasive area is apparent. Crosby found, for 
two groups, correlations of .62 and .66 between 
Kuder persuasive scores and self-estimated 
scores in this area, as compared to a con- 
tingency coefficient of .58 found here. Simi- 
larly, in the literary area, Crosby’s correlations 
of .51 and .61 can be compared to the contin- 
gency coefficient of .61 obtained here between 
self ratings and the Kuder. 




Discussion 


The results presented here are in general
agreement with the results obtained by other 
investigators and the correlation between 
measured and self-estimated interests approxi- 
mates .50. In agreement with Paterson's 
hypothesis concerning the relative subtlety of 
the two tests, scores on the Kuder tend to have 
a closer relationship to self ratings of interests 
than do the scores on the Strong. This may 
be a function not only of the items in the tests 
but also of the categories used in grouping the 
scales and defining the self ratings, although 
these categories were achieved through careful 
study of both tests. 

The men studied here found it relatively 
difficult to estimate their own scientific inter- 
ests, as they were measured by the tests, and 
on the other hand, the men were more able to 
estimate their persuasive and sales interests. 
In no occupational area, however, was there 
close enough agreement between measured in- 
terests and self-estimated interests to suggest 
that counseling can be done on the basis of one 
or the other. As long as measured interests 
have a relevancy for vocational satisfaction 
and as long as self-estimated interests play an 
important role in the vocational deliberations 
of individuals, both types of interests must be 
considered. 

The counselor working with individuals
similar to those in this sample can expect to
find many more significant scores, as defined
by the test authors, on the Kuder blank than
[...]. Here each person averaged [...]
above on the Kuder and only [...]
on the Strong. This is a [...]
area, a “significant” score on the Strong test
is a more rare occurrence than a “significant”
score on the Kuder.



[...] service work, as implied by their primary
patterns in group V on the Strong test. Over
one-half, however, will indicate they bear a
marked resemblance to the men in this area,
as shown by their self ratings. [...]
enough, all three indices can be accepted.
Strong discusses the predominant similarity
of interests and would perhaps agree that most
of these subjects are more like men in social
service than they are unlike them. The
Strong blank does not take into account these
similarities, however, but rather, utilizes the
differences between groups, and consequently
relatively few of this sample obtain primary
patterns in the social service area. In other
words, the subjects who receive primary
patterns in group V share with the men in the
standardization groups those interests which
tend to make them different from men in
general. We can only say that those subjects
who obtain high scores on the social service
scale of the Kuder are showing interests which
are shown by people in the social service area,
many of these interests also being held in
common with men not in these areas.
One explanation for the relatively greater
ease of estimating scores on the Kuder may
thus be derived. In estimating Kuder scores
the subject needs consider only his similarities
to men in the defined groups, but in estimating
Strong scores, he needs to consider both how
he resembles men in the defined groups and
also how he differs from men in general.

These considerations do not adequately explain
the degree of agreement between self
estimates of interests and test scores. Recognizing
that this agreement is reduced by the
unreliability of test scores and the inconsistency of
self ratings, what are those factors which are
related to both interest scores and self
estimates?
Preliminary determinations of some of these
factors have been reported (1). In an intensive
study of 136 college students, college
aptitude was found to be significantly related
to measured interests in engineering but not
to expressed interests. Those students with
measured engineering interests participated in
fewer religious activities than did those students
with no measured interests. This difference
in activities was not found when the
group was analyzed on the basis of expressed
engineering interests. Preference for high
school mathematics teachers was related to
measured interests in engineering but not to
expressed interests. Family background was
related equally to both types of interests.
When business interest groups were compared,
morale scores on the Minnesota Personality
Scale were related to expressed interests in
business but not to measured interests, and
frequency of “social” activities bore the same
relationships. An elaboration of this type of
study would throw light on the differences between
measured and expressed interests.

Other questions arise. Why is there a 
tendency for scores and self-estimates in some 
areas to be more closely related than in other 
areas? What distinguishes those people who 
can quite accurately estimate their interests, 
as measured by tests, from those who cannot? 

Many questions remain to be answered con-
cerning the relationships between measured 
and estimated interests. 


References 


1. Berdie, R. F. Factors associated with vocational
interests. J. educ. Psychol., 1943, 34, 257-277.

2. Bordin, E. S. A theory of vocational interests as
dynamic phenomena. Educ. & Psychol. Meas.,
1943, 3, 40-65.

3. Christensen, T. E. Some observations with respect
to the Kuder Preference Record. J. educ.
Res., 1946, 40, 96-107.

4. Crosby, R. C., and Winsor, A. L. The validity of
student estimates of their interests. J. appl.
Psychol., 1941, 25, 408-414.


5. Darley, J. G. Clinical aspects and interpretation of
the Strong Vocational Interest Blank. New York: 
Psychological Corporation, 1941. 

6. Diamond, S. The interpretation of interest pro- 
files. J. appl. Psychol., 1948, 32, 512-520. 

7. Gordon, H. C., and Herkness, W. W. Do voca- 
tional questionnaires yield consistent results? 
Occupations, 1942, 20, 424-429.

8. Kelso, D. F., and Bordin, E. S. The ability to 
manipulate occupational stereotypes inherent in 
the Strong Vocational Interest Test. Amer. 
Psychol., 1948, 3, 352-353. 

9. Kopp, T., and Tussing, L. The vocational choices 
of high school students as related to scores on 
vocational interest inventories. Occupations, 
1947, 25, 334-339. 

10. Kuder, G. F. Kuder Preference Record, Revised 
Manual. Chicago: Science Research Associates, 
1946. 

11. Laleger, G. E. Vocational interests of high school 
girls. New York: Bureau of Publications, 
Teachers College, Columbia Univ., 1942. 

12. Longstaff, H. P. Fakability of the Strong Vocational
Interest Blank and the Kuder Preference
Record. J. appl. Psychol., 1948, 32, 360-369.

13. Paterson, D. G. Vocational interest inventories in
selection. Occupations, 1946, 25, 152-153.

14. Rose, W. A comparison of relative interest in
occupational groupings and activity interests as
measured by the Kuder Preference Record.
Occupations, 1948, 26, 302-307.

15. Stefflre, B. The reading difficulty of interest
inventories. Occupations, 1947, 26, 95-96.

16. Strong, E. K. Vocational interests of men and 
women. Stanford: Stanford Univ. Press, 1943. 

17. Thorndike, E. L. Adult interests. New York: 
Macmillan, 1935. 

18. Wittenborn, J. R., Triggs, F. O., and Feder, D. D. 
A comparison of interest measurement by the 
Kuder Preference Record and the Strong Voca- 
tional Interest Blanks for men and women. 
Educ. & Psychol. Meas., 1943, 3, 239-257. 


Visual Differentiation of Moving Objects 


Newell C. Kephart and Guy G. Besnard 
Division of Applied Psychology, Purdue University 


Much work has been done at the Occupa- 
tional Research Center, Purdue University, in 
cooperation with various industrial plants in an
attempt to find skills which are pertinent to 
performance on various industrial jobs. At 
the present time employees of cooperating 
plants are tested and a statistical comparison 
of job performance and test scores is made. 
Recently, research has been attempted to see 
whether it would be possible to attack the 
problem from another angle by setting up in 
the laboratory counterparts of industrial jobs 
or parts of jobs. As one such approach the 
study of the visual differentiation of moving 
objects was selected. 


Experimental Procedure 


Description of Apparatus and Material. The appa- 
ratus consisted of an inclined trough down which clear 
glass spheres could be rolled. This trough was made 
of wood, 46 inches long and 1.5 inches wide. The 
inside was lined with white paper since preliminary 
experiments showed that a dark lining made the task 
too easy. A second trough underneath the first one
and sloping in the opposite direction allowed the 
spheres to return to the experimenter in the original 
sequence (see Figure 1). 

Twenty clear glass marbles of the ordinary commer- 
cial type were provided. Of these, 10 were left plain 
and the remaining 10 were scratched with a sharp 
instrument so that three circles were on three
circumferences in three perpendicular planes. For the
purpose of terminology in the test situation the unmarked
spheres were called “good” and the etched spheres
“bad.”

A head rest was provided to keep the subject’s head 
in a specified position during the experiment, so that 
each subject would view the spheres from approxi- 
mately the same distance and at approximately the 
same angle. In the early experiments, the spheres 


were released at the top of the trough and allowed to 


roll down by themselves. Since the spheres were im- 
mobile at the time of rel 


The room in which the experiment was conducted
was kept in darkness except for a 10.5 inch, 40 watt
tubular Lumiline bulb, mounted in a clear tin reflector.
This light was placed 12.5 inches above the table and
directly above the last 18 inches of the trough. By
keeping the different physical aspects of the material
and the apparatus constant we hoped to reduce our task
to one variable, that of discriminating between the
“good” and the “bad” moving spheres.

Visual skills were measured with the Bausch & Lomb
Ortho-Rater. This instrument provides measures of
12 visual skills which have been found on the basis of
scientific investigation to be important to success on
individual jobs. The 12 skills measured are: 1. Far
Phoria Vertical; 2. Far Phoria Lateral; 3. Far Acuity
Both Eyes; 4. Far Acuity Right Eye; 5. Far Acuity
Left Eye; 6. Depth Perception; 7. Color Vision; 8.
Near Acuity Both Eyes; 9. Near Acuity Right Eye;
10. Near Acuity Left Eye; 11. Near Phoria Vertical;
and 12. Near Phoria Lateral.

First Experiment. Each subject was given the rolling
sphere test from two views, one from the side of
the trough (where he could see the spheres roll from
right to left in front of him) and one from the
end of the trough (where he could see the spheres
rolling toward him). These two trials were given at
different times, usually on succeeding days.

About half of the subjects were given the side view
first (hereafter referred to as “side view first”) and
later were given the “end view” (hereafter referred to
as “end view second”); the other half were given the
end view (end view first) and later were given the side
view (side view second). When each subject came for
the first time, he was given the following verbal
instructions about the experiment:

“In this experiment spheres will be rolled down [...]
such as this (show example); these spheres are
‘good.’ Other spheres are marked such as this (show
[...] Please [...] I will now [...]

[Figure: top view and side view of the apparatus, showing the marble,
the trough and a 3 RPM motor; scale 1" = 10".]

Fig. 1. Top and side views of apparatus.


and placed in the return trough, were rolled by one of
the experimenters, one at a time for 8 sequences,
making a total of 160 spheres for each of the two views.
Each sphere was released at the top of the trough at
about the time the previous sphere was passing under
the screen and took approximately 2 seconds to roll down
the entire length of the trough, but was visible to the
subject for only .6 of a second.

Forty-six members of a class in beginning psychology
at Purdue University were used in the first experiment.
Ortho-Rater test scores were available for these students.
The subjects were selected on the basis of their
visual profile so that they would be fairly homogeneous
with regard to their near visual acuity and with as wide
a range of phoria test scores as possible.
Second Experiment. The essential differences between
the first and the second experiment were: 1. The
addition of a motor to bring up the spheres and drop
them in the trough automatically. The motor was
regulated to drop a sphere in the trough every 2.5 seconds.
It took these spheres approximately 1 second to roll
down the trough, but it was visible to the subject only
about .4 second. 2. The subjects in this last experiment
were not selected on the basis of their visual
scores, but were a random sample of a class of approximately
150. 3. The number of spheres was held to 4
sequences of 20 spheres or 80 spheres altogether, since
the first experiment had shown this to be a sufficient
number of trials for reliability.

The rest of the test procedure remained constant.
[...] subjects from a beginning course in psychology
were used in this experiment.


Results 


The number of errors made by each subject
(an incorrect response or a response of “don’t
know” was considered an error) was computed.
These computations were made for
each view separately by orders (first or second). 
In addition to the total number of errors made 
by each individual there are also totals for 
each of the sequences of 20 spheres viewed by 
the subject. 

Reliability. Reliability was computed sepa- 
rately for the end and for the side view, but no 
differentiation was made, as to whether the 
view was the first or the second. Reliability 
was measured by the odd-even technique, 
taking as the “Odds” the odd numbered 
spheres of the odd numbered 20 sphere se- 
quences plus the even numbered spheres of 
the even numbered sequences, and as “Evens” 
the even numbered spheres of the odd se- 
quences plus the odd numbered spheres of the 
even sequences. Thus each sphere was in- 
cluded alternately in the “even” and “odd” 
for each view, since minor unintentional differ- 
ences in the etching of some of the marbles, or 
inherent flaws, might affect the appearance of 
individual spheres and therefore the scores of 
the subject. The resulting “odd-even” corrected
correlation for the side view of the first
experiment was r=.87. The corrected reliability
of the end view was r=.95. The reliability
for the side view was influenced significantly
by one exceptional case. With this
case omitted, the corrected reliability was
r=.64. The apparently lower
reliability of the side trial probably is, in large
measure, due to the markedly lower variability
of scores for the side due to the low number
of errors.
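
The alternating odd-even split described above can be sketched as follows. The text does not state which correction produced the "corrected" coefficients; the sketch assumes the usual Spearman-Brown step-up for a test of doubled length, and all function names and the demonstration data are illustrative only.

```python
from math import sqrt


def split_half_scores(sequences):
    """Split one subject's errors into the two halves described in the text.

    `sequences` is a list of 20-sphere sequences; each entry is 1 for an
    error and 0 for a correct response. "Odds" take odd-numbered spheres of
    odd-numbered sequences plus even-numbered spheres of even-numbered
    sequences; "Evens" take the remainder.
    """
    odds = evens = 0
    for seq_no, sequence in enumerate(sequences, start=1):
        for sphere_no, error in enumerate(sequence, start=1):
            if (seq_no + sphere_no) % 2 == 0:  # odd/odd or even/even
                odds += error
            else:
                evens += error
    return odds, evens


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Assumed correction: estimated full-length reliability from a half-test r."""
    return 2 * r_half / (1 + r_half)


if __name__ == "__main__":
    # Hypothetical half scores for five subjects.
    odd_scores = [3, 7, 5, 10, 2]
    even_scores = [4, 6, 5, 9, 3]
    r = pearson_r(odd_scores, even_scores)
    print(f"half-test r = {r:.2f}, corrected = {spearman_brown(r):.2f}")
```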

The reliability as shown in the second ex- 
periment was measured in the same manner as 
in the first experiment. The corrected coeffi- 
cient for the end view, in the second experi- 
ment, was .89. The corrected reliability of 
the side view for the second experiment was .89. 

It will be noted that in the second experi- 
ment the reliability of both the side and the 
end view are approximately the same. The 
higher reliability from the side view in the 
second experiment can probably be explained 
by the fact that the variability of scores from 
the side view was larger due to increased diffi- 
culty of the test. 

Differences Between End and Side Views. 
The mean number of misses made by each 
group on the end and on the side was com- 
puted. The mean number of misses made by 
all the subjects in the second experiment when 
looking at the spheres from the end view was 
14.97. Their mean number of misses from 
the side view was 8.48. The difference between
these means was 6.49. This difference
is 5.31 times its standard error. 
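
The figures in this paragraph can be retraced with a few lines of arithmetic. The standard error itself is not reported in the text; the value computed below is only the one implied by the reported difference and ratio, and the variable names are illustrative.

```python
mean_misses_end = 14.97    # end view, second experiment (from the text)
mean_misses_side = 8.48    # side view, second experiment (from the text)
reported_ratio = 5.31      # difference divided by its standard error

difference = mean_misses_end - mean_misses_side
implied_standard_error = difference / reported_ratio

if __name__ == "__main__":
    print(f"difference between means: {difference:.2f}")
    print(f"implied standard error:   {implied_standard_error:.2f}")
```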

This difference between means would indicate
that the discrimination of moving objects
is a much easier task when these objects are
viewed from the side than when they are
viewed from the end or coming toward the
subject.

The Pearsonian r was computed between
the score made on the end view and the score
made on the side view. [...] This breakdown
resulted in two groups. The first group
consisted of those who took the order side view
first, end view second. The second group
consisted of those who took the order end
view first, side view second. [...]
of these coefficients, it [...]
different factors are involved in
performing this task from the
side and from the end.

Correlation with Vision Tests. [...]
show possible relationship between [...]
test scores and visual skills as measured by
the Ortho-Rater, correlation coefficients were
computed.
Both the Pearsonian r and an eta were computed
for all the acuities and the far depth
perception since it was not known in advance
whether the correlation would be linear or
curvilinear. In the case of the phoria scores
both lateral and vertical, far and near, previous
experience had shown that the relationship
between success on a task and scores on these
tests was curvilinear and therefore only etas
were computed.

The Pearsonian rs varied from -.09 to
.48. The correlation ratios varied from [...]
to .79. Since the correlation ratio is not a
highly reliable measure where the number of
cases is small the correction suggested by
Guilford was applied. This correction
resulted in coefficients varying from .00 to [...].
The majority of the coefficients are positive.
We must not forget, however, that the number
of cases used in this experiment is [...]
small and that any generalization based on a
careful study of the correlation coefficients would
not be highly valid.

It would appear, however, that the visual
abilities of the subject as measured by the
Ortho-Rater tests are correlated in a low but
positive fashion with ability in discriminating
moving objects.
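
The reason for computing eta alongside r can be illustrated with a small sketch: when performance is best at middle scores and poorer at both extremes, r may be near zero while the correlation ratio is high. The code below computes the uncorrected correlation ratio only; the correction attributed to Guilford is not reproduced because its formula is not given in the text. All names and data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_ratio(x_classes, y_values):
    """Uncorrected eta of y on x: sqrt(between-class SS / total SS)."""
    groups = defaultdict(list)
    for x, y in zip(x_classes, y_values):
        groups[x].append(y)
    grand_mean = sum(y_values) / len(y_values)
    ss_total = sum((y - grand_mean) ** 2 for y in y_values)
    if ss_total == 0:
        raise ValueError("y has no variability")
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    return sqrt(ss_between / ss_total)


if __name__ == "__main__":
    # Hypothetical phoria classes (1-3) and task scores: best in the middle class.
    phoria_class = [1, 1, 2, 2, 3, 3]
    task_score = [1, 1, 5, 5, 1, 1]
    print(f"r   = {pearson_r(phoria_class, task_score):.2f}")
    print(f"eta = {correlation_ratio(phoria_class, task_score):.2f}")
```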


Summary 


A test of ability to discriminate fine details
in moving objects was constructed. The test
involved discriminating between marked and
unmarked clear glass spheres as they were
rolling down an incline. Two test conditions
were investigated: 1. With the spheres rolling
toward the subject; and 2. With the spheres
rolling laterally past the subject.

This technique was found to have a reliability
of .89 to .95 when the spheres were rolling
toward the subject and .89 when the spheres
were rolling past the subject.

It is felt that studies of this type might
well reveal desirable changes in design of
industrial machines to take account effectively
of the demands made upon the operator.
Biomechanics has been concerned
with the abilities and skills required by the
operation of machines which are already
installed and operating. It would appear that
relatively simple laboratory studies of the
psychological and physiological implications
while the machine is still in the design stage
could make contributions resulting in equipment
which, when finally installed, would be
more efficient through an increase in the
efficiency to be expected from the operator.
The present study suggests the possibility of
such an approach.


Scores on the moving object test were cor- 
related with the scores in the visual skills tests 
measured by the Ortho-Rater. These correla- 
tions would indicate that there is a low but 
relatively consistent positive relationship be- 
tween ability to differentiate fine details in 
moving objects and visual skill test scores. 


Received April 25, 1949.


An Analysis of Visual Requirements in Industry * 


E. J. McCormick 


Occupational Research Center, Purdue University 


The importance of certain visual skills to
performance on industrial jobs has been em- 
phasized by the results reported from various 
investigations such as those conducted by 
Tiffin (6, 7, 8, 9, 10), Kephart (2, 3), and Stump 
(4, 5). To a considerable extent these and 
other investigations have been made with re- 
spect to employees on individual jobs; they 
have revealed in specific situations marked 
relationships between visual skills and various 
aspects of performance such as production, 
turnover, and accident experience. The sev- 
eral studies made on individual jobs have sug- 
gested the possible desirability of investigating 
the relationship between certain visual skills 
and performance on a variety of industrial jobs. 


Statement of the Problem 


The Occupational Research Center of the 
Division of Education and Applied Psychology, 
Purdue University, is engaged in the develop- 
ment of vision-test profiles for various indus- 
trial jobs. These profiles are used by indus- 
trial and business organizations for the dual 
purpose of aiding in employee selection and of 
identifying those individuals whose visual 
skills may not meet the general visual demands 
of their jobs and who might therefore benefit 
from professional eye care. 

In the past the visual profiles for jobs were 
individually adapted for specific jobs on the 
basis of the relationships for present employees 
between their visual skills as measured by the 
Ortho-Rater¹ and their criterion measurements.
Some such relationships might be due to chance 
factors, however, since the vagaries of human 


the degree of Doctor of Phil 
dissertation was directed b 


¹ The Ortho-Rater


manufactured by th 


traits are of such a character that in almost
any single situation it is likely, by chance
alone, that some statistically significant
relationship might show up; if such a relationship
appears to be a logical one, there is the
temptation to accept it without cross-validation on
a "hold-out" group.

The present investigation is an extension
of research on the relationship between certain
visual skills and job performance, directed
toward providing a basis for establishing
visual-skill profiles for industrial use which
would avoid the potential pitfalls of
individually adapted profiles. In the light of this
general objective two specific objectives were
established, namely: (1) The examination of
the relationship between visual acuity and job
performance of employees on a variety of
different jobs in order to determine any such
relationships that are of general applicability;
and (2) The possible development of a different
method for establishing visual acuity test
cut-offs for visual profiles for certain types of
industrial jobs.


Basic Data


The basic data used in the study were the
results of Ortho-Rater vision tests and measures
of job performance on approximately 5500
employees on 92 jobs in a number of industrial
establishments. This information was obtained
from the extensive files of the Occupational
Research Center.

Selection of Hold-Out Jobs. A sample of 51 jobs that
were in general randomly selected was drawn from the files
of the Occupational Research Center to serve as a
"hold-out" group. This sample of "hold-out" jobs was
drawn in order to be able to cross-validate the
results of certain analyses (to be described later) which
were made on a group of "base" jobs. Only jobs with
about 20 or more employees were included.

The 51 jobs were then empirically divided into two
groups on the basis of the judged character of the
predominant visual demands. The first group (Group 1)
included predominantly near vision jobs, those which
were considered to require close, and rather constant,
visual attention within a relatively restricted range,
usually within arm's reach. Typical of the jobs
included in this group are hosiery loopers, sewing-machine
operators, assemblers of small parts, and certain types
of visual inspectors.

The other group (Group 2) included jobs which
required varying degrees of both near and far visual
attention, and might therefore be considered as
combination near and far vision jobs. Such jobs typically
require some relatively close visual attention, say within
arm's reach, as well as some farther visual attention at
varying distances, either within or beyond the
immediate work environment. Illustrative of jobs included
in this group are knitting machine operators, weavers,
spinners, and punch press operators.

Selection of Base Jobs. In order to make certain
types of analyses, a sample of similarly representative
"base" jobs was selected. A careful empirical
examination was made of the relationship between visual
acuity and the performance criteria of employees on
these jobs. The only jobs retained were those on which
it was empirically ascertained that there was a
satisfactory degree of relationship between visual acuity and
job performance. (These base jobs were so "selected"
in order to ascertain, by subsequent analyses, the
patterns of acuity skills which presumably contributed
to performance on such jobs, with the thought that
such "patterns" could later be cross-validated on the
hold-out jobs.) A total of 41 such base jobs was
retained. These base jobs likewise were divided into
Group 1 (near vision jobs) and Group 2 (combination
near and far vision jobs).

Criteria of Job Performance. The criteria of job
performance varied from plant to plant and from job
to job, and were based on such factors as production
data or earnings, or were the results of ratings, including
ratings made by the paired comparison technique.
In all instances the criterion measurements had been
reduced to four or five numerical categories. The
criterion categories for each job were divided into two
groups, usually of approximately the same size, which
were then identified as "high criterion" employees and
"low criterion" employees. In instances where there
was a middle category which precluded an
approximately even division of the individuals into two groups,
this middle criterion category was omitted.
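
The division into "high criterion" and "low criterion" employees can be sketched in Python. This is only an illustration under stated assumptions: the category codes (1 = poorest), the function names `best_cut` and `dichotomize`, and the rule of choosing the most nearly even split are hypothetical and are not taken from the original procedure.

```python
from collections import Counter


def best_cut(categories):
    """Return the category value c for which the 'low' group (<= c) and
    the 'high' group (> c) are as nearly equal in size as possible.
    Assumes at least two distinct category values are present."""
    counts = Counter(categories)
    levels = sorted(counts)
    total = len(categories)
    best, best_gap, low = None, None, 0
    for c in levels[:-1]:  # the top category cannot serve as a cut
        low += counts[c]
        gap = abs(total - 2 * low)  # |size of high group - size of low group|
        if best_gap is None or gap < best_gap:
            best, best_gap = c, gap
    return best


def dichotomize(categories, cut, omit=None):
    """Label each employee 'low' or 'high'; employees in the omitted
    (middle) category are labelled None and dropped from later analysis."""
    labels = []
    for c in categories:
        if c == omit:
            labels.append(None)
        else:
            labels.append("low" if c <= cut else "high")
    return labels


# Hypothetical five-category criterion for one job
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5]
labels = dichotomize(ratings, cut=best_cut(ratings), omit=3)
```

In this made-up job the middle category (3) prevents an even split, so it is omitted, leaving three low criterion and three high criterion employees.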
Vision Tests. Ortho-Rater vision-test scores were
available on all employees of the "base" and "hold-out"
jobs. The Ortho-Rater includes tests of the following
visual skills: (1) Far tests (given at the optical
equivalent of 26 feet): phoria, vertical; phoria, lateral; acuity,
both eyes; acuity, right eye; acuity, left eye; depth
perception; and color discrimination; and (2) Near tests
(given at the optical equivalent of 13 inches): acuity,
both eyes; acuity, right eye; acuity, left eye; phoria,
vertical; and phoria, lateral.

The six acuity tests were given particular attention
in the present investigation. The results of the far
acuity, both-eye test and of the near acuity, both-eye
test were used directly. The results of the far acuity
tests of the right and left eyes were compared for each
individual and only the lower, or "worse-eye"
score was used. A similar comparison was made of
the near acuity tests for the right and left eye. The
four measures of visual acuity actually used were then:
far acuity, both eyes; far acuity, worse eye; near acuity,
both eyes; and near acuity, worse eye.

The range of possible scores on these tests is from
zero to 15.²
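
As a brief illustration, the reduction of the six acuity scores to the four measures used can be written as follows. The function name `acuity_measures` and the dictionary keys are hypothetical labels chosen for this sketch; the only rule taken from the text is that the worse-eye score is the lower of the right-eye and left-eye scores, with scores running from zero to 15.

```python
def acuity_measures(far_both, far_right, far_left,
                    near_both, near_right, near_left):
    """Reduce six Ortho-Rater acuity scores to the four measures used."""
    scores = (far_both, far_right, far_left, near_both, near_right, near_left)
    if any(not 0 <= s <= 15 for s in scores):
        raise ValueError("acuity scores are expected to lie between 0 and 15")
    return {
        "far_both": far_both,
        "far_worse": min(far_right, far_left),    # lower of the two eyes
        "near_both": near_both,
        "near_worse": min(near_right, near_left),
    }


# Hypothetical employee record
measures = acuity_measures(9, 8, 6, 10, 7, 11)
```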


Relationship Between Visual Acuity 
and Job Performance 


The over-all relationship between each of 
the four visual acuity tests and job performance 
is reflected by the proportion of individuals 
scoring at each score level who were high 
criterion employees. This basic relationship 
for each test is presented graphically in Figures 
1, 2, 3, and 4, which include the relationship 
for all of the 51 hold-out jobs combined, as well 
as for the Group 1 jobs and Group 2 jobs 
separately. 
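
The quantity plotted in the figures, the per cent of employees at each score level who were high criterion employees, can be computed as in the following sketch. The function name and the sample data are hypothetical; the scores and criterion labels would come from the job records.

```python
from collections import defaultdict


def percent_high_by_score(scores, is_high):
    """Per cent of high criterion employees among all employees
    at each test-score level."""
    totals = defaultdict(int)
    highs = defaultdict(int)
    for score, high in zip(scores, is_high):
        totals[score] += 1
        if high:
            highs[score] += 1
    return {s: 100.0 * highs[s] / totals[s] for s in sorted(totals)}


# Made-up scores on one acuity test, with criterion labels
scores = [5, 5, 6, 6, 6, 7]
is_high = [False, True, True, True, False, True]
curve = percent_high_by_score(scores, is_high)
```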

Both the far and near acuity, both-eye 
graphs show (Figures 1 and 3), for all 51 jobs 
combined, a relatively marked increase in the 
proportion of high criterion employees at each 
successive score value in the lower score range, 
with a less marked, though still persistent, in- 
crease at each successive score level in the 
higher score range. When considering the 
Group 1 and Group 2 jobs separately, however, 
it will be observed for the far acuity, both-eye 
test, that the Group 1 jobs do not reflect the 
drop at the lower end of the test-score range 
that occurs for the Group 2 jobs. For the 
near acuity, both-eye test, both Group 1 and 
Group 2 have essentially the same pattern of 
relationship with job performance. 

For the two worse-eye tests (Figures 2 and
4), the positive, relatively straight-line rela-
tionship from the low to high scores indicates
that with higher worse-eye acuity (both near
and far) the probability of satisfactory per-
formance on jobs of the types studied increases
quite constantly. A slight up-swing for the
Group 1 jobs at the very low scores, especially 
for near acuity, worse eye, implies that for 
constant, close visual work a marked restriction 
of acuity is more conducive to job success 
than a relatively low level of acuity, presumably 
because the worse eye does not then hamper 
the effective use of the better eye as it might 
if the acuity level were only relatively low. 


² For a conversion of these scores to other common
measures of visual acuity the reader is referred to the 
manual of standard practice (11). 


Fig. 1. Per cent of high criterion employees to total
employees at successive scores on Far Acuity, Both-
Eye Test. (Curves: 16 near jobs; 35 near-far jobs; all 51 jobs.)


Aside from this difference between Group 1
and Group 2, however, the curves for the two
groups are basically the same.


Fig. 2. Per cent of high criterion employees to total
employees at successive scores on Far Acuity, Worse-
Eye Test. (Curves: 16 near jobs; 35 near-far jobs; all 51 jobs.)


Experimental Vision Test Profiles


Previous mention has been made of some
of the potential disadvantages of developing
visual standards for individual jobs on the
basis of the relationship between the vision
test scores and the performance criteria of
employees on the jobs in question. To try
to overcome these disadvantages, two different
methods were developed for setting visual-
acuity standards for jobs of the types investi-
gated. These methods resulted in the estab-
lishment of acuity-test cut-off scores which
were incorporated in "profiles" for the jobs.

Acuity Profile Derived from "Average"
Acuity Scores. For each of the 41 base jobs
a visual-acuity profile was individually devel-
oped which resulted in an adequate
differentiation between the high and
low criterion employees. For each of the four
acuity tests, the average was computed of the
cut-off scores which were included in these
individually-developed profiles.
The whole scores nearest the averages for the
respective tests were established as the cut-off
scores for the first type of experimental profile.
This profile (designated as profile A) included
the following cut-off scores: far acuity, both
eyes, 8; far acuity, worse eye, 6; near acuity,
both eyes, 8; and near acuity, worse eye, 7.

Acuity Profiles Adjusted to the Visual Acuity
Level of Employees on the Job. The second type
of experimental acuity profile was designed to
establish the cut-off scores on the four acuity
tests on the basis of some standardized relation-
ship to the general acuity level of the employees
on the job in question. This "standardized
relationship" was developed by an analysis of
data on the base jobs, for subsequent applica-
tion and cross-validation on the hold-out jobs.
The specific procedures involved in the analy-
sis of the 41 base jobs are given below.


1. Procedures used in analyzing data on
each of four acuity tests (Note: these are de-
scribed in terms of a single test, near acuity,
both eyes, but were similar for all four tests):

a. For each of the 41 base jobs, a frequency
distribution was made of the scores of the
employees on the near acuity, both-eye test.


b. For each of the 41 frequency distribu-
tions certain percentiles were computed to the
nearest whole-scores. The percentiles used
were the 10th, 15th, 20th, 25th, and 30th,
since a preliminary investigation suggested
that the percentiles in this range would most
adequately serve the purposes of subsequent
analyses.

c. For each job an individualized visual
acuity profile had been previously developed
(with cut-off scores on the four acuity tests)
which was empirically judged to differentiate
adequately between the high criterion and the
low criterion employees.

d. For each job the difference in score units
was determined between the both-eye near
acuity cut-off score of the previously developed
profile and each of the specified percentiles on the
test of the employees on the job.

e. These differences in score units were then
summarized for all 41 jobs, there being a
separate summary of these differences for the
10th, 15th, 20th, 25th, and 30th percentiles.
Independent summaries of the same type were
also made for the Group 1 jobs and for the
Group 2 jobs. Varying degrees of "stability"
of these differences were noted. For the
Group 1 jobs, for example, it was observed
that for 15 of 20 jobs the cut-off score (of the
individually developed profiles) was one score
unit below the 25th percentile for the respective
jobs. It was found that corresponding differ-
ences for some percentiles were less stable.


Fig. 3. Per cent of high criterion employees to total
employees at successive scores on Near Acuity, Both-
Eye Test. (Curves: 16 near jobs; 35 near-far jobs; all 51 jobs.)

Fig. 4. Per cent of high criterion employees to total
employees at successive scores on Near Acuity, Worse-
Eye Test. (Curves: 16 near jobs; 35 near-far jobs; all 51 jobs.)


2. Procedures used in setting up formulas 
for experimental profiles of four acuity tests: 

a. Several experimental profile “formulas” 
were developed on the basis of the type of 
analysis outlined above. These profile for- 
mulas embodied for the four acuity tests the 
more "stable" of the relationships reflected in 
the differences between the cut-off scores of the 
individually adapted profiles of the base jobs 
and certain percentiles. Following is an illus- 
tration (for one test) of the way in which these 
relationships were converted into profile for- 
mulas for use in establishing cut-off scores on 
the various tests for individual jobs: using the 
example given in (e) above for the near acuity, 
both-eye test, the cut-off score on this test for 
a given job would be set at one score unit 
below the 25th percentile of the distribution of 
scores of the employees on this test. 




For the several experimental profiles of this 
type the cut-off scores on the four acuity tests 
for a given job would be established in a 
similar manner; various combinations of such 
relationships were used in the different experi- 
mental profiles. Some profiles were developed 
for use on jobs of the type included in Group 1,
and others for jobs of the type included in
Group 2. These profiles were designated by
letter, from B through K.

b. Some of the experimental profiles of this 
type excluded certain tests, and some specified 
lower limits for cut-off scores (either in the 
form of specified percentiles or specific scores). 
These experimental variations were suggested 
on the basis of more detailed analysis of the 
data on the 41 base jobs. 
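
The percentile analysis and the resulting profile "formulas" can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: the nearest-rank definition of a percentile, the function names, and the sample data are all assumptions, since the text states only that percentiles were computed to the nearest whole score.

```python
import math
from collections import Counter


def percentile_score(scores, pct):
    """Nearest-rank percentile (an assumed definition): the lowest score
    at or below which at least pct per cent of the employees fall."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def offset_tally(job_scores, individual_cutoffs, pct):
    """For each job, the difference in score units between the individually
    developed cut-off and the given percentile, tallied over jobs. A tally
    concentrated on one value marks a 'stable' relationship."""
    return Counter(
        individual_cutoffs[job] - percentile_score(scores, pct)
        for job, scores in job_scores.items()
    )


def formula_cutoff(scores, pct=25, offset=-1, floor=None):
    """Cut-off for a job from a profile formula, e.g. one score unit
    below the 25th percentile, with an optional lower limit."""
    cutoff = percentile_score(scores, pct) + offset
    if floor is not None:
        cutoff = max(cutoff, floor)
    return cutoff


# Made-up near acuity, both-eye scores for the employees on one job
job = [4, 6, 7, 8, 8, 9, 9, 10, 10, 10, 11, 12]
cutoff = formula_cutoff(job)  # one unit below the 25th percentile
```

The `floor` argument corresponds to the specified lower limits for cut-off scores that some of the experimental profiles included.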

Profiles of Phoria, Depth, and Color Tests. 
In order to examine the comparative results of 
the various experimental acuity profiles by 
themselves, and also when phoria, depth, and 
color tests were added to the battery, an aux- 
iliary profile which included a specified com- 
bination of cut-off scores on these tests was 
added to all of the acuity profiles. This aux- 
iliary profile (profile X) was developed in con- 
junction with a companion study (1), and 
consisted of the test cut-off scores which were 
most frequently included in the profiles of the 
base jobs as originally established in the regular 
procedures used by the Occupational Research 
Center.³ The combination of the X profile
with an acuity profile is indicated by the 
addition of X to the alphabetical identification 
of the acuity profile, as BX, or KX. 


Application of Experimental Profiles 


The various experimental profiles, devel-
oped from data on the base jobs, were applied
to the hold-out job groups as indicated below
for the purpose of cross-validation: Group 1
jobs: profiles A, AX, B, BX, C, CX, D, DX,
E, EX, F, FX, G, GX, H, HX; and Group
2 jobs: profiles A, AX, B, BX, I, IX, J, JX,
K, KX.


³ Profile X included the following cut-off scores:

Far vertical phoria, and near vertical phoria
    Low-end cut-off (left hyperphoria)      1
    High-end cut-off (right hyperphoria)    9
Far lateral phoria, and near lateral phoria
    Low-end cut-off (esophoria)             2
    High-end cut-off (exophoria)           14
Depth Perception
Color Discrimination


The "A" and "AX" profiles included fixed
cut-off scores on the various tests. For the
other profiles (B through K and BX through
KX) the acuity-test cut-off scores for the
individual jobs were set by the formulas pre-
viously described.

For each profile applied to a given job the 
test scores of each employee were examined; 
and each individual who failed one or more of 
the tests (whose test score on a given test was 
at or below the cut-off score for that test) was 
considered as failing the profile. 
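
The pass-fail rule just stated is simple enough to express directly. In the sketch below the function name, the dictionary keys, and the cut-off values are illustrative assumptions; only the rule itself (a score at or below the cut-off on any test fails the profile) comes from the text.

```python
def fails_profile(scores, cutoffs):
    """True if the employee's score on any test in the profile is
    at or below the cut-off score for that test."""
    return any(scores[test] <= cut for test, cut in cutoffs.items())


# Illustrative acuity cut-offs for one profile (values are examples only)
cutoffs = {"far_both": 8, "far_worse": 6, "near_both": 8, "near_worse": 7}

passing = {"far_both": 9, "far_worse": 7, "near_both": 9, "near_worse": 8}
failing = {"far_both": 9, "far_worse": 7, "near_both": 8, "near_worse": 8}
```

The second employee fails because the near acuity, both-eye score is exactly at the cut-off, which counts as failing under the rule.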

For each profile applied to each job, the
numbers of high criterion employees and of low
criterion employees who passed and who failed
the profile were determined. The proportions
of those passing and of those failing each
profile were then determined, and the critical
ratio of the difference between these propor-
tions was computed.
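
One common way of computing such a critical ratio is sketched below. Two assumptions are made that the text does not confirm: that the proportions compared are the proportions of high criterion employees among those passing and among those failing the profile, and that the standard error is based on the pooled proportion. The function name is likewise hypothetical.

```python
import math


def critical_ratio(high_pass, low_pass, high_fail, low_fail):
    """Difference between the proportion of high criterion employees among
    those passing and among those failing, divided by its standard error
    (pooled-proportion form, assumed here)."""
    n_pass = high_pass + low_pass
    n_fail = high_fail + low_fail
    if n_pass == 0 or n_fail == 0:
        raise ValueError("both the passing and failing groups must be non-empty")
    p_pass = high_pass / n_pass
    p_fail = high_fail / n_fail
    pooled = (high_pass + high_fail) / (n_pass + n_fail)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_pass + 1 / n_fail))
    if se == 0:
        raise ValueError("standard error is zero; the ratio is undefined")
    return (p_pass - p_fail) / se


# Made-up counts for one job: 30 high and 20 low criterion employees pass,
# 10 high and 20 low criterion employees fail
cr = critical_ratio(30, 20, 10, 20)
```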

The subsequent step involved the deter-
mination of the effectiveness of each profile
on the entire group of jobs to which it was
applied. For this purpose an over-all critical
ratio was computed.⁴


Results of Experimental Profiles


Table 1 presents the results of the several
experimental profiles on the entire group of
jobs to which they were individually applied.
This table includes for each profile the over-
all critical ratio previously mentioned. The
over-all critical ratio for a given profile implies
the probability that the differentiation be-
tween the high criterion and low criterion
employees could be due to chance factors.

Acuity Profiles. When examining the over-
all critical ratios of the acuity profiles on
the hold-out jobs it will be observed that


⁴ The over-all critical ratio for each profile was com-
puted from the following formula applicable for repeti-
tions of similar experiments, as provided by I. W.
Burr, Department of Mathematics, Purdue University:
n; Mon, 
R= - CR; E i 
Mor, 1. 
N m 


in which CR_c stands for the critical ratio for all jobs
combined, M_CR stands for the mean of the critical
ratios of the individual jobs, and N stands for the
number of jobs.


Table 1

Summary of Results of Experimental Vision Test
Profiles on 51 Hold-out Jobs

             Over-all                 Over-all
Profile         CR        Profile        CR

Group 1 (16 near vision jobs)
   A           2.46         AX          4.13
   B           2.03         BX          4.23
   C           2.35         CX          5.04
   D           1.69         DX          4.08
   E           2.86         EX          4.26
   F           2.81         FX          4.08
   G           1.97         GX          4.17
   H           3.52         HX          5.16

Group 2 (35 combination near and far vision jobs)
   A           7.07         AX          5.95
   B           8.75         BX          6.21
   I           7.31         IX          5.50
   J           9.14         JX          6.31
   K           8.10         KX          6.08

Group 1 and 2 (51 jobs)
   A           7.24         AX          7.24
   B           8.39         BX          7.51



they range from 1.69 (5 per cent confidence
level) to 9.14, which is far beyond chance
expectations, with most of them being above
the 2 per cent level (2.33). In terms of the
over-all critical ratios, therefore, most of the
acuity profiles may be considered as producing
a differentiation that is beyond reasonable
chance expectations.

The various acuity profiles applied to the
Group 1 jobs all produced essentially the same
results on that group; similarly, the several
profiles applied to the Group 2 jobs all gave
approximately the same results on this group
of jobs. In general, however, the results for
the Group 2 jobs are more pronounced than
for the Group 1 jobs. It was because of this
difference in over-all results that more experi-
mental profiles were developed and applied to
the Group 1 jobs than to the Group 2 jobs.

With regard to the results of the profiles on
the Group 1 jobs, however, profile H (which
excludes the far acuity tests and provides more
rigid standards for the near acuity tests) re-
sulted in a somewhat greater degree of differ-
entiation than the other profiles. This implica-
tion of the relatively restricted importance of
far acuity on such jobs (in comparison with
near acuity) is in accord with the over-all rela-
tionships previously discussed.

The degree of differentiation between high 
criterion and low criterion employees that re- 
sulted from the experimental profiles might 
logically be considered as reflecting a minimum 
degree of the true relationship between the 
visual skills and job performance. Among the 
several possible factors that might contribute 
to this under-evaluation are the character of 
the criteria for some jobs, possible differences 
in illumination, differences in specific duties 
of various individuals included in the same job 
category, differences in age, and the possibility 
that certain jobs reflect a restricted range of 
talent due to the possible elimination after 
placement of some of the visually unfit. 

It is also pointed out that in the case of 
five of the Group 1 jobs and five of the Group 
2 jobs it was ascertained empirically that not 
even an individually adapted job profile would 
result in a reasonable degree of differentiation, 
perhaps due to inadequate criteria or chance 
factors. For such jobs it therefore could not 
be expected that a standardized profile or 
profile formula would result in a statistically 
significant differentiation. 

Profiles Including Acuity, Phoria, Depth, 
and Color Tests. In addition to the application 
of the various acuity profiles by themselves, 
the acuity profiles were also individually ap- 
plied with the addition of the auxiliary X 
profile, which includes a set of fixed scores for 
the phoria, depth, and color tests. The addi- 
tion of the X profile to the acuity profiles 
applied to the Group 1 hold-out jobs consis- 
tently resulted in additional discrimination. 

With the Group 2 jobs, however, the addi- 
tion of the X profile to the various acuity 
profiles resulted in a moderate decrease from 
the results obtained from the acuity profiles by 
themselves, although even then the over-all 
critical ratios were from 5.59 to 6.31. While 
this actual observed decrease was traced pri- 
marily to the systematic influence of a negative 
relationship for a very few specific jobs, there 
does not appear to be any evidence that the 
addition of the phoria, depth, and color tests 
adds particularly to the differentiation on 
Group 2 jobs over that obtainable with the 
acuity tests by themselves. Presumably the 
acuity tests tapped a good share of the measurable
relationships between visual skills and
performance on the jobs in this group.

Since the phoria, depth, and color tests 
apparently contribute to the differentiation 
of Group 1 jobs over that produced by the 
acuity profiles, one inference that might logi- 
cally be made is that in the case of jobs which 
require relatively constant and close visual 
attention, the adequate adjustment of con- 
vergence as measured by the phoria tests, and 
possibly the ability to perceive depth relation- 
ships, presumably are related to satisfactory 
performance on such jobs. 


Implications 


The results of the investigation suggest 
certain factors which might be considered in 
the establishment of visual standards for jobs 
of the type studied. 

For example, the basic relationships be- 
tween the four visual acuity tests and job 
performance reflected in Figures 1, 2, 3, and 4 
imply that there should be a moderate lower 
limit for near acuity in both eyes for both near 
vision jobs (Group 1) and for combination 
near and far vision jobs (Group 2), and also a 
moderate lower limit for far acuity for com- 
bination near and far vision jobs (Group 2). 
Aside from these moderate minima, however, 
it appears that the visual requirements for 
jobs of the types studied are relative rather 
than absolute; it would seem then that various 
specific visual standards which require rela- 
tively adequate degrees of acuity may be ex- 
pected to produce, in the over-all, relatively 
satisfactory results in differentiating between 
high criterion and low criterion employees on 
jobs of the types investigated. 

In addition, it would seem that for most 
effective results the visual acuity standards for 


im off scores for all jobs. 

With regard to the possible use of standard-
ized profile formulas for setting visual acuity
job-standards, it should be pointed out that
while no such formula can be expected to
produce the degree of differentiation among
present employees that is possible by developing
individually adapted profiles for specific jobs,
nevertheless, the use of standardized profile
formulas seems to offer a more adequate basis,
in the long run, for predicting job performance
on most jobs of the types investigated and for
the referral of employees for professional
eye care.


Summary and Conclusions 


An investigation was made of the relation-
ship between certain visual skills and perfor-
mance on various types of industrial jobs, using
Ortho-Rater scores of approximately 5500
employees on 41 "base" jobs and 51 "hold-out"
jobs, and measures of job performance which
identified high criterion employees and low
criterion employees. The investigation was
made with particular attention to four meas-
ures of visual acuity, namely, far and near
acuity in both eyes, and far and near acuity
in the worse eye. The investigation included
two phases: first, an analysis for each of the
four tests of the proportion of all individuals
on the hold-out jobs who scored at each score
level who were high criterion employees; and
second, the development and application of
several experimental vision test profiles and an
analysis of their results.

The primary conclusions are as follows:


1. Visual acuity requirements for satis-
factory performance on jobs of the types
studied seem to be general and relative rather
than specific and absolute.

2. A moderate minimum of near visual
acuity in both eyes is especially pertinent to
satisfactory performance for both "near" and
"combination near-far" jobs, and a moderate
minimum of far visual acuity is equally per-
tinent to performance on "combination near-
far" jobs; beyond such minima, performance
increases quite constantly with greater degrees
of acuity, but at a more gradual rate.

3. The increase in the probability of job
success with increases in worse-eye acuity (both
near and far) is relatively constant over the
entire range of scores.

4. The several experimental acuity profiles
(developed from data derived from the base
jobs and applied to the hold-out jobs) gave
substantially similar results on the group of
jobs to which they were individually applied
(one group of near vision jobs and another
group of combination near and far vision jobs).

5. The results of practically all the profiles
were beyond reasonable chance expectations,
although the results were more pronounced on
the combination near and far jobs than on
near jobs.

6. The addition to each of the acuity profiles
of a standard set of fixed scores on phoria,
depth, and color tests contributed noticeably
to the differentiation between high criterion
and low criterion employees on the near jobs,
but not on the combination near and far jobs.

7. The possible use of standardized profiles
shows definite promise of providing, in the
long run, relatively satisfactory standards of
visual skills for jobs of the types investigated.


Received May 6, 1949. 


References


1. Carr, E. R. An analysis of the relationship of
phoria, depth perception and color discrimination
to job performance. Master's thesis, Purdue
Univ., 1948.

2. Kephart, N. C. Visual skills and labor turnover.
J. appl. Psychol., 1948, 32, 51-55.

3. Kephart, N. C. An analysis of professional eye
care and industrial efficiency. Trans. Amer.
Acad. Ophthal. & Otol., March-April, 1946,
166-178.

4. Stump, N. F. Spotting accident-prone workers by
vision tests. Fact. Mgmt. & Maint., June, 1945.

5. Stump, N. F. Visual functions as related to acci-
dent-proneness. Personnel, 1944, 21, 50-56.

6. Tiffin, J. Industrial psychology. (2nd Ed.) New
York: Prentice-Hall, Inc., 1947.

7. Tiffin, J. Vision and industrial production. Illum.
Engng., 1945, 40, 230-257.

8. Tiffin, J. The use of visual data as an aid to in-
crease production and efficiency. Trans. Amer.
Acad. Ophthal. & Otol., January-February, 1944.

9. Tiffin, J., and Greenly, R. J. Employee selection
tests for electrical fixture assemblers and radio
assemblers. J. appl. Psychol., 1939, 23, 240-
263.

10. Tiffin, J., and Rogers, H. B. The selection and
training of inspectors. Personnel, 1941, 18,
14-31.

11. Standard practice in the administration of the
Bausch and Lomb occupational vision tests with
the Ortho-Rater. Bausch and Lomb Optical
Company, Rochester, N. Y., 1944.


The Effect of Ordinal Position upon Responses to 
Items in a Check List 


Donald T. Campbell and Phillip J. Mohr 
Departments of Psychology and Speech, The Ohio State University 


With the increased use of questionnaires 
and check lists has come a healthy awareness 
of the many possible sources of bias in such 
instruments. One source of bias, which this 
paper considers, is position effect. Two types 
of position effect may be noted: (1) the effect 
of ordinal position, or position per se; and (2) 
the effect of specific sequences, involving con- 
tent interaction. The effect of ordinal posi- 
tion has been mentioned in standard works on 
the questionnaire method (e.g. 2) and this 
possibility has been regarded by some pro- 
fessional survey workers to be of sufficient 
importance to necessitate systematic controls. 

This anxiety is justified by surprisingly little
published research. In factual tests, where
responses presumably are based on knowledge,
studies have shown significant ordinal differ-
ences for the alternatives of a multiple choice
item, although with somewhat inconsistent re-
sults (1, 9, 11). Mathews has similarly studied
the effect of different left-right arrangements of
response alternatives for the items of an in-
terest test (12). In opinion poll studies frag-
mentary reports indicate possible position
effects (4; 5, pp. 34-5; 7). Of these, Cantril (5)
finds significant but inconsistent differences
for various orderings of two or three alterna-
tives to a single poll question. From the
available abstracts of the two other studies,
it would appear that specific sequence effects
(in which content interaction might be in-
volved) are confounded with purely ordinal
effects (such as might be due to primacy,
recency, fatigue, or the like). No studies have
been located in which the effect of position
per se has been studied systematically.

Our study attempts to isolate the effect of
ordinal position in a preference check list
already in use in professional work. We borrow
from the annual Iowa Radio Audience Survey, whose
director, Dr. F. L. Whan, has been one of those
who have been worried by the possibility of
position effect, and has attempted to control it
by using in rotation multiple forms of a radio
program types list. With Dr. Whan's permission,
we have taken with slight modification the check
list of 16 radio program types used in the 1946
survey. Modifications in procedure have been
introduced only insofar as necessitated by group
administration in college classes.

The check list, as we used it, had the follow- 
ing instructions: 


“Listed below are sixteen general types of
radio program materials. In the spaces
provided at the right, check the FIVE TYPES
that you like BEST. Please check ONLY
FIVE—no more, no less.” 


Half of the questionnaires listed the program
types with examples provided for each type.
The examples, as in Dr. Whan’s studies,
represented well-known popular programs with
comparatively high “Hooper” ratings. The other
half of the questionnaires gave no examples.
We wanted to determine whether or not relative
ambiguity, i.e., absence of examples, would
accentuate any position effect, on the assumption
that if the stimuli (the program types) were less
well defined respondents might be more apt to
choose with reference to position of an item on
the check list rather than to its content. The 16
program types are listed below, with the asterisk
indicating the termination of the items in the
“without examples” series:


1. Livestock and Grain Market Reports*

2. Talks on Farming and Farm Problems*

3. News Broadcasts*, including local news, network
commentators and farm news

4. Talks and Discussions of Important Public
Affairs*, such as talks by congressmen,
Town Meeting of the Air, etc.

5. Devotional Programs*: Sermons, talks on
religion, etc.

6. Hymns, Religious Music*, such as Hymns of all
Churches, Salt Lake City Tabernacle Choir, etc.

7. Classical or Semi-Classical Music*, such as
Metropolitan Opera, New York Philharmonic
Orchestra, Andre Kostelanetz, etc.

8. Old-Time Pioneer or Western Music*, such as
National Barn Dance, etc.

9. Popular Music and Popular Orchestras*, such as
Fred Waring, Lucky Strike Hit Parade, Vaughn
Monroe, etc.

10. Brass Bands*, such as the U. S. Army, Navy, or
Marine Bands, etc.

11. Comedians*, such as Jack Benny, Bob Hope,
Fred Allen, etc.

12. Variety Programs*, without featured comedians,
such as Breakfast Club, Arthur Godfrey’s Talent
Scouts, etc.

13. Quiz or Audience Participation Programs*, such
as Break the Bank, Quiz Kids, Vox Pop, Truth
or Consequences, etc.

14. Complete Dramatic Shows*, such as Lux Radio
Theatre, Screen Guild, etc.

15. Daily Continued Story Serials*, such as Ma
Perkins, When a Girl Marries, Lum and Abner,
etc.

16. Broadcasts of Sports Events*, such as broadcasts
of football, basketball, baseball games, fights, etc.


Fig. 1. The latin square design (letters designate questionnaire forms).
[Grid of the 16 ordinal positions against the 16 program types; the cell
letters are not legible in this copy.]


Experimental Design. A “latin square”
provides the ideal design for such a study (8),
but to our knowledge it has not previously
been used for this purpose. Through this
device, we prepared sixteen check list forms,
with each item appearing once on each form,
and once in each ordinal position. Subject to
these restrictions, the distribution of items in
the latin square might well have been random.
But Fisher (6, pp. 267-269) points out that
“in a well-planned experiment certain restrictions
may be imposed on the random arrangement
of the plots in such a way that the experimental
error may still be accurately estimated.”
As mentioned before, this study was designed
to measure the effect of ordinal position, apart
from any sequence or content interaction
effect. Therefore, rather than using a completely
random latin square, we used one in
which the sequence was systematically controlled
by having each program type preceded
once and followed once by every other type.¹

The latin square is shown in Figure 1. The
letters A through P identify the sixteen basic
questionnaire forms. For example, in form A,
Classical Music was first, Dramatic Shows
second, Public Affairs third, etc. Because of
the use of the “with examples” series and the
“without examples” series, a total of 32 different
forms was actually used.
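
The sequence-balancing property described above can be illustrated with a short sketch. It uses a standard cyclic (zig-zag) construction for even orders; this is an assumption made for illustration, since the text does not say how the Statistics Laboratory built its square, and the function names are hypothetical. Rows stand for forms, columns for ordinal positions, and the integers 0 to 15 for program types.

```python
def row_complete_latin_square(n):
    """Return an n x n latin square (n even) in which every ordered pair of
    distinct symbols occupies adjacent positions in exactly one row."""
    if n < 2 or n % 2:
        raise ValueError("this cyclic construction needs an even n")
    # First row zig-zags outward from 0: 0, 1, n-1, 2, n-2, ...
    first, low, high = [0], 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Each further row is the first row with a constant added modulo n.
    return [[(item + shift) % n for item in first] for shift in range(n)]


def adjacent_pair_counts(square):
    """Count how often each ordered pair (before, after) is adjacent in a row."""
    counts = {}
    for row in square:
        for before, after in zip(row, row[1:]):
            counts[(before, after)] = counts.get((before, after), 0) + 1
    return counts


square = row_complete_latin_square(16)  # 16 forms x 16 ordinal positions
pairs = adjacent_pair_counts(square)
print(len(pairs), set(pairs.values()))
```

If the construction behaves as intended, all 16 x 15 ordered pairs of program types appear, each exactly once, so every type is immediately preceded once and followed once by every other type.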

¹ We are indebted to the Statistics Laboratory of the Department of
Mathematics, The Ohio State University, for providing this
“homogeneous” latin square and for doing the bulk of
the calculations. In passing, it may be noted that
squares of this sort are available only for even numbers
(3). The Statistics Laboratory will be happy to
provide such squares to those interested.




The Sample. The questionnaire was sub- 
mitted to 1280 students in 35 beginning classes 
in Psychology and Speech. Forty students 
filled in each form. Eight hundred and eighty- 
four of the respondents were men, 374 women, 
with 22 neglecting to indicate sex. 

Administration of the Survey. No advance
warning was given to the students. Each instructor
passed out the questionnaires, with
these instructions: “I am sure this is
self-explanatory. Do not collaborate with your
neighbors. Do not ask about or mention
aloud any program or specific radio program,
since that might influence the response of
others.” We are reasonably sure that the
students felt that this was simply a survey of
radio program preferences of college students.

The 32 forms were distributed randomly in
all classes, each respondent’s ballot being
different in arrangement of choices from those of
his neighbors.


Results 
The Analysis of Results. The results are
presented in Figure 2, combining cases with
and without examples. They are presented
graphically in Figure 3, keeping these two
sets of data separate. In Figure 2, the frequency
of the choice of each program type is
given for each of the 16 positions. For example,
40 respondents selected “Broadcasts of
Sports Events” (No. 16) when that item appeared
in the 1st position; 37, when it appeared
in the 2nd position, etc. The maximum
possible for any program type in any given
position is 80. The totals at the bottom indicate
the frequency of choice of the program
types; the totals at the right, the frequency
of choice of the ordinal positions. Note that
the frequency of choice of each program type
in each position is summed in the marginal
totals at the bottom; in the totals at the right,
the frequency of choice of each position for
each program type is summed. In short, each
position and each program type had equal
opportunity to be chosen. Note also that all
1280 respondents had the opportunity of being
represented in each marginal total (for position
and for program types), thus equating
samples for all comparisons.
As might be expected, significant differences
occur among the preferences for the 16 program
types. “Popular Music and Popular Orchestras”
was most preferred; “Livestock and
Grain Market Reports” (Farm Markets) was
least preferred. But to the surprise of the
experimenters no apparent position effect is
found.


Fig. 2. [Table: frequency of choice of each program type (columns) in
each of the 16 ordinal positions (rows), with marginal totals. The cell
values are not legible in this copy.]


Fig. 3. Percentage of choice by position and by program type.
[Chart: percentage scale from 0 to 90, with the chance value marked at
31.2%. Four lines: position without examples, position with examples,
program without examples, program with examples. Program types are
labelled, from most to least chosen: Popular Music, Comedians, Classical
Music, Dramatic Shows, News, Sports Events, Quiz Shows, Variety Programs,
Public Affairs, Brass Bands, Hymns, Western Music, Farm Talks,
Devotional, Serials, Farm Markets.]
The graphic presentation of the material
in Figure 3 supports this. This chart superimposes
the data for program preference and
the data for position effect, to present visually
the contrast in variability provided by the two
dimensions of the data. In Figure 3 there is
a separate presentation of data for “with examples”
and “without examples,” which data
were presented combined in Figure 2. Note
the lack of any greater position effect in the
“without examples” series. Both lines seem
to vary at random about the expected value
of 31.2% (400/1280). The presence of examples
does seem to have had an effect on
the popularity of certain programs.
Statistical tests of significance confirm in
general the absence of position effect. The
marginal totals for the ordinal positions in
Figure 2 were tested collectively by Chi Square
against the expected value of 400 which would
obtain if position had no effect. The obtained
Chi Square of 12.4 had a P-value of .68, indicating
that values as large or larger than 12.4
could occur by chance 68 times out of 100. The
hypothesis that differences among the totals
are but chance deviations from the expected
value cannot be rejected by this test. Similar
tests were made of the position effect for each
program type separately (each column in
Figure 2) and none of these Chi Squares reached
a 5% level of significance. If any position
effect is present, it is not sufficiently strong to
manifest itself with this number of cases, and
as tested by Chi Square.²
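
The arithmetic behind this test can be sketched as follows. The expected value of 400 follows from 1280 respondents each making five choices spread over 16 positions. The position totals in the sketch are hypothetical placeholders (the Figure 2 totals are not legible here), and the tail probability is computed from the standard closed-form series for the chi-square distribution rather than from whatever table the authors consulted, so it need not agree with the printed P-value to the last digit.

```python
import math


def chi_square_stat(observed, expected):
    """Goodness-of-fit statistic against one common expected frequency."""
    return sum((o - expected) ** 2 / expected for o in observed)


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df
    degrees of freedom, using the finite series for integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term = total = 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    root = math.sqrt(x)
    term = root
    total = term if df > 1 else 0.0
    for r in range(2, (df - 1) // 2 + 1):
        term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(root / math.sqrt(2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


expected = 1280 * 5 / 16  # 400 choices per position if position is irrelevant

# Hypothetical position totals (NOT the paper's data); they sum to 6400.
hypothetical_totals = [418, 425, 402, 371, 396, 409, 391, 404,
                       399, 421, 388, 376, 403, 392, 407, 398]

stat = chi_square_stat(hypothetical_totals, expected)
print(round(stat, 2), round(chi_square_sf(stat, 15), 3))

# Tail probability for the chi-square value reported in the text (15 d.f.).
print(round(chi_square_sf(12.4, 15), 3))
```

Sixteen position totals constrained to a fixed sum leave 15 degrees of freedom, which is the value assumed in the calls above.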
² [...] statistics have not been applied. In passing it may be
noted that latin squares and simultaneous use of several
classification criteria do not lend themselves exclusively
to analysis of variance. The present data, being
enumeration data, do not meet the basic assumptions of
analysis of variance. While an arc-sine transformation
of the data might have partially overcome this [...]


Our original expectations of a position 
effect involved, of course, some notion of trend.
More specifically, we had expected the first 
positions to receive more choices than other 
positions, with a downward trend from first to 
last positions. Visual inspection of Figures 2 
and 3 offers little support for this notion. 
Note that, for the combined data, position 2 
is highest and position 4 is lowest; position 10 
one of the highest, and position 12 one of the 
lowest. However, a more complete answer to 
our initial notion of trend was obtained from an 
application of Mann’s “T” test. This non- 
parametric statistic of trend seemed especially 
applicable for our purposes in that it measures 
upward or downward trends in data that are 
enumerative in character, without the require- 
ment of equality of intervals between steps, 
and making no assumptions of normalcy. The 
application of this test to our data indicates 
that, at a very marginal level of significance 
(P=.048), the totals at the right in Figure 2 
tend to decrease with an increase in ordinal 
position number. This level of significance 
certainly is not very high, particularly when 
the relatively large number of cases is con- 
sidered. And, as noted above, frequent ex- 
ceptions occur to this trend, if it is a trend. 
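
A rank-based trend test of this kind can be sketched as below. The sketch counts the pairs of positions in which an earlier position has the larger total and refers that count to its exact distribution over all orderings; this is offered as being in the spirit of Mann's test (10), not as a reproduction of the authors' computation. It assumes no tied totals, the function names are hypothetical, and the totals are placeholders rather than the values of Figure 2.

```python
def count_decreasing_pairs(values):
    """Number of pairs i < j with values[i] > values[j]."""
    n = len(values)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if values[i] > values[j])


def inversion_distribution(n):
    """counts[k] = number of orderings of n distinct values that contain
    exactly k decreasing pairs."""
    counts = [1]
    for m in range(2, n + 1):
        new = [0] * (len(counts) + m - 1)
        for k, c in enumerate(counts):
            for extra in range(m):
                new[k + extra] += c
        counts = new
    return counts


def downward_trend_p(values):
    """One-sided probability of at least as many decreasing pairs as
    observed when every ordering is equally likely (no ties assumed)."""
    observed = count_decreasing_pairs(values)
    dist = inversion_distribution(len(values))
    return sum(dist[observed:]) / sum(dist)


# Hypothetical position totals for positions 1 to 16 (NOT the paper's data).
hypothetical_totals = [418, 425, 402, 371, 396, 409, 391, 404,
                       399, 421, 388, 376, 403, 392, 407, 398]
print(count_decreasing_pairs(hypothetical_totals),
      downward_trend_p(hypothetical_totals))
```

Because only the ordering of the totals enters the calculation, the procedure needs neither equal intervals between positions nor any distributional assumption about the totals.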

A further source of evidence on the presence
or absence of a position effect is found in the
comparison of the position totals from different
samples. If position were to have a highly
complicated and non-linear effect, it should
nonetheless be consistent from sample to
sample. With this in mind, the correlation
coefficients among position totals for four
subsamples were computed, as shown below.
“(A)” represents the series of questionnaires
which provided examples for each program
type; “(B),” the series which gave no examples:


Product Moment Correlation of:

Men (A) with Men (B) is .21
Women (A) with Women (B) is -.05
Men (A) with Women (B) is .14
Women (A) with Men (B) is .26
Men (A) with Women (A) is -.05
Men (B) with Women (B) is .13


These correlations describe little, if any, 
similarity in the obtained position totals from 
one sample to another. Note, however, that 
the general tendency is for the values to be 
positive, with two exceptions. Once again, 
there is no clear evidence for a position effect, 
but only some slight indications that one may 
be present.
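
For readers who wish to repeat this kind of comparison, a product moment correlation between two sets of sixteen position totals can be computed as in the sketch below. The two series are hypothetical placeholders, since the subsample totals are not reported in the text, and the function name is an illustrative choice.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical position totals for two subsamples (NOT the paper's data).
sample_one = [140, 151, 133, 128, 139, 142, 131, 137,
              136, 149, 130, 127, 138, 134, 141, 135]
sample_two = [65, 61, 70, 58, 62, 66, 63, 67,
              60, 68, 64, 59, 66, 61, 65, 63]
print(round(pearson_r(sample_one, sample_two), 2))
```

A consistent position effect, whatever its shape, would show up as a sizable positive correlation between such series.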

The present data seem to indicate little if
any effect of ordinal position upon the number
of choices received by the items in the check
list. It is to be emphasized that the conditions
under which these results have been obtained
are highly specific. We would
suggest no generalizations of these results,
obtained on college students, to the legendary
housewife whose soup is boiling over on the
stove during the interview. Similarly, our
results may be specific to the longish 16-item
check list, or to the strict assignment of five
choices, or the content of the items. Further
research is obviously needed. Each commercial
user of the check list may want to determine
the presence or absence of a position
effect under his own operating conditions. We
do believe, however, that through the use of
designs such as the one presented in this study,
the definitive answer for the specific situation
can be obtained with the relatively small effort
of preparing systematically multiple forms of
the check list.


Received April 18, 1949. 


References 


1. Atwell, E. R., and Wells, F. L. Wide range multiple
choice vocabulary test. J. appl. Psychol.,
1937, 21, 550-555.

2. Blankenship, A. B. Consumer and opinion research.
New York: Harper and Brothers, 1943.

3. Bugelski, B. R. A note on Grant's discussion of
the latin square principle in the design and
analysis of psychological experiments. Psychol.
Bull., 1949, 46, 49-50.

4. Cahalan, D., and Tamulonis, V. M. The effect of
question variations in public opinion surveys.
Amer. Psychol., 1947, 2, 328 (Abstract).

5. Cantril, Hadley. Gauging public opinion. Princeton:
Princeton University Press, 1944, 34-35.

6. Fisher, R. A. Statistical methods for research
workers. [...]

7. [...] validity. Amer. Psychol., 1947, 2, 328
(Abstract).

8. Grant, D. A. The latin square principle in the
design and analysis of psychological experiments.
Psychol. Bull., 1948, 45, 427-442.

9. McNamara, W. J., and Weitzman, E. The effect
of choice placement on the difficulty of
multiple-choice questions. J. educ. Psychol., 1945, 36,
103-113.

10. Mann, Henry B. Non-parametric tests against
trend. Econometrica, 1945, 13, 245-259.

11. Mathews, C. O. The effect of position of [...]
children's answers to questions in two-response
types of tests. J. educ. Psychol., 1927, 18, 445-457.

12. Mathews, C. O. The effect of the order of printed
response words on an interest questionnaire.
J. educ. Psychol., 1929, 20, 128-134.


Identification of Cola Beverages. 


IV. Postscript 


N. H. Pronko and D. T. Herman 


University of Wichita 


Three earlier studies¹ on the identification
of cola beverages have led to the present
investigation. The first study employed Coca
Cola, Pepsi Cola, RC Cola, and Vess Cola.
Administered to 108 Ss, results showed an
almost equal distribution of identification
responses among the first three categories.
Since Ss refused to use the fourth name, it was
decided to use only the first three drinks in a 
second study on the hypothesis that identifi- 
cation responses would be more nearly equally 
distributed among the three beverage names. 
The hypothesis was substantiated as was also 
a subsequent one in Study No. 3 which em- 
ployed three unknown or less well-known colas 
but which showed an almost chance distri- 
bution of the Coca Cola, Pepsi Cola and RC 
Cola names to the three “dark horses." On 
the assumption that subjects might do better 
if they were told to identify Coca Cola, Pepsi 
Cola and RC Cola when these beverages were 
actually administered and when the subjects 
were told that these were the Colas they were 
to identify, the present study was undertaken. 


Procedure 


As in the previous studies, two groups of
subjects were used: 105 Ss in Part I and 60 in
Part II. These were beginning students in
Elementary Psychology courses.


Part I. Each of 105 Ss was admitted
individually into the experimental room and
was asked to sit down, after which the following
instructions were read to him:


“We would like to have you taste and identify some
Cola drinks. We will have three colas: Pepsi Cola,
Coca Cola and RC Cola. You will be told in what
order and when you are to drink them. After you
have finished each sample report your identification.
After each stimulus presentation, take
enough water from the paper cup before you to
rinse your mouth well.”


¹ Pronko, N. H., and Bowles, J. W., Jr. Identification
of cola beverages. [...]


A tray containing three one-oz. glasses of 
Coca Cola, Pepsi Cola and RC Cola respec- 
tively was placed before the S. He was then 
instructed in what order to drink the beverages 
labelled X, Y and Z. Samplings were spaced 
about a minute apart during which interval 
pertinent data were recorded. 

The order in which the three beverages were 
presented was determined pre-experimentally 
and was such that each beverage was admin- 
istered in the first, second and third positions 
an equal number of times (i.e, 35 times). 
This counterbalanced order was intended to
nullify position effects or taste interactions
orally. All beverages were kept out of sight of
Ss and were refrigerated at approximately 5° C.
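
One way to obtain such a counterbalanced schedule is sketched below. It cycles through the three rotations of the beverage list, which places each beverage in each serving position 35 times for 105 subjects. The text does not say which orders were actually used, so the rotation scheme and the function name are assumptions for illustration; note that rotations balance position only, not every possible sequence.

```python
from collections import Counter
from itertools import cycle, islice

BEVERAGES = ("Coca Cola", "Pepsi Cola", "RC Cola")


def counterbalanced_orders(items, n_subjects):
    """Assign serving orders by cycling through the rotations of items."""
    k = len(items)
    if n_subjects % k:
        raise ValueError("number of subjects must be a multiple of len(items)")
    rotations = [items[i:] + items[:i] for i in range(k)]
    return list(islice(cycle(rotations), n_subjects))


orders = counterbalanced_orders(BEVERAGES, 105)

# Tally how often each beverage lands in each serving position.
position_counts = Counter(
    (position, drink)
    for order in orders
    for position, drink in enumerate(order, start=1)
)
for key in sorted(position_counts):
    print(key, position_counts[key])
```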

Part II. In Part II, 60 Ss were administered
the same Cola drink at each of three
trials. Thus, 20 got all Coca Cola; 20, all
Pepsi Cola; and 20, all RC Cola. In all other
respects, including the instructions, the procedure
was the same as that for Part I.


Results 


Results show that of 105 Coca Cola samples,
57 are correctly identified but 48 are
misidentified. In the case of the 105 Pepsi Cola
samples 45 are correctly labelled and 60
misidentified. This is similar to the case of RC
Cola which is correctly identified 47 times and
misidentified 58 times. In all three cases
these colas are misidentified almost as often as
they are correctly identified.

In Part II (where each of 20 Ss was given
three samples of the same cola) there was a
frequency variation of from 17 to 23 correct
responses.

As to the percentage of correct responses
when Ss were given different colas, for the Coca
Cola samples, there were 54% correct
identifications, 43% for Pepsi Cola and 45% for RC
Cola. As in the previous study, Coca Cola
is in the lead as regards correct identification.
For all three brands 34% are correctly called
and 66% incorrectly.
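
The Part I percentages can be checked directly from the frequencies reported above (105 samples of each brand). The short sketch below does only that arithmetic; the variable names are illustrative.

```python
# Correct identifications in Part I, as reported in the text.
correct = {"Coca Cola": 57, "Pepsi Cola": 45, "RC Cola": 47}
SAMPLES_PER_BRAND = 105

percent_correct = {
    brand: 100 * count / SAMPLES_PER_BRAND
    for brand, count in correct.items()
}

for brand, pct in percent_correct.items():
    print(f"{brand}: {pct:.1f}% correct, {100 - pct:.1f}% misidentified")
```

Rounded to whole numbers these agree with the 54%, 43% and 45% given for Part I.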




A comparison of critical ratio tests of the
hypothesis that the various identification
responses are not on the basis of actual taste
stimuli was then made. The correct identifications
of the three respective Colas do not
show the same results. Those for Pepsi Cola
and RC Cola are low, actually 1.44 and 1.71
respectively. However, the CR for the Coca
Cola category is 3.19 which indicates a statistically
significant difference between the obtained
frequency and chance expectancy.
Furthermore, analysis of data based on frequency
of correct responses shows very low
ratios ranging from 0 to 1.00, indicating that
in no instance does any obtained frequency
vary significantly from chance expectancy.

Next the percentages of correct identifica- 
tions of Parts I and II were compared. In 
the former case three different Colas were 
given while in the latter three identical samples 
were administered. In the case of Pepsi Cola 
and RC Cola, the tests of a statistically signifi- 
cant difference between Parts I and II yield the 
low critical ratios of .76 and .89 respectively. 
However, the CR obtained for a difference
between Coca Cola identified correctly in Parts
I and II is 3.44, a statistically significant difference
which suggests a behavioral difference
between these parts. Apparently, as a group,
our subjects do not discriminate in the same
way in the two situations. There is a difference
in the number of correct discriminations
when they are given three samples of the same
Cola as compared with administration of three
different Colas.

Significance tests of a comparison of Study
I, where correct names were not given in the
instructions to S, and the present study, in
which they were indicated, show that when
the frequency of correct identifications for the
two studies is compared, no reliable differences
are found. However, this does not mean that
Coca Cola is not being identified with better
than chance frequency as found in the present
study.

A similar comparison for Parts II of Study
II and the present study yields CR's for differences
in Coca Cola, Pepsi Cola and RC Cola
correct identifications that are not reliably
different. Those for the incorrect identifications
show high variability.

The overall results from this study suggest 
that when Ss are placed in the comparatively 
restricted situation of the present experiment, 
their identifications are no better than in Part 
II of the previous studies (i.e. when they are 
administered three identical samples of a cola 
beverage). Their identification responses are 
not statistically different from chance ex- 
pectancy. The same holds for identification 
of the Pepsi Cola and RC Cola drinks. The 
only shift concerns the Coca Cola beverage 
which is identified with greater than chance 
frequency in this situation where choice of 
correct name is experimentally limited to the 
three names given the subject in the instruc- 
tions. Narrowing his choice apparently per- 
mits him to make more strikes, although even 
in this situation he misidentifies Coca Cola 
almost as often as he identifies it. 


Summary 


A total group of 165 Ss was asked to 
identify one-oz. samples of the following three 
Cola Beverages: Coca Cola, Pepsi Cola and 
Royal Crown (RC) Cola. In Part I, 105 Ss 
were presented one of each of three different 
Colas while in Part II, 60 Ss were given three 
samples of the same beverage being evenly 
divided among the three different classes. In- 
structions to both groups stated the beverages 
to be the three named above. 

In general, when Ss are given three samples 
of the same beverage their identifications are 
not significantly different than when the bever- 
ages are unknown Colas or actually these Colas 
but unspecified. The same holds for Pepsi 
Cola and RC Cola identifications when three 
different samples are given. The situation 
with Coca Cola is different, for in this study 
this beverage is identified with a frequency that 
yields a statistically significant difference from 
chance expectancy. 


Received October 21, 1949. 
Early publication. 


Book Reviews 


Reynolds, Lloyd G., and Shister, Joseph. Job 
horizons. New York: Harper and Brothers, 
1949. Pp. x + 102. $2.25.


This is a descriptive report to lay readers 
issued by the Labor and Management Center 
of Yale University. It is concerned with 
causes of labor mobility or immobility as 
given to fifteen interviewers by 800 manual 
workers in a New England manufacturing city 
in 1947. 

The first two chapters cover the general 
problem and importance of mobility and the 
factors in job satisfaction. The factors are 
defined and discussed in an interesting manner 
and include numerous quotations from manual 
workers. The next two chapters relate to how 
workers go about finding new jobs, what causes 
them to pick one job rather than another, why 
they left school, and how they went about 
getting their first job. Few persons will fail 
to profit from a careful reading of these chap- 
ters, and they are of special value to vocational 
counselors. Emphasis is placed on the non- 
rational procedure typically followed by ap- 
plicants, their lack of knowledge of the labor 
market, and inappropriateness of current eco- 
nomic theory. The last two chapters cover 
movement up the occupational ladder (which 
is obviously limited because the study is
restricted to those in manual occupations) and
the worker's view of job opportunity. This
last chapter draws the other chapters together 
and points out that the worker’s behavior is 
nonrational or irrational from the viewpoint of 
economic theory rather than the circumstances 
as the worker sees them. 

The book is concluded with a brief ap- 
pendix on methods including sampling. Many 
readers will question the use of samples randomly
selected from manual workers listed in
city directories. As the authors have pointed


out, many of the more mobile workers have 
been eliminated. 

As a research report, even for the general
public, this book is deficient in several ways.
Unwarranted generalizations and conclusions
seem to have been made, and the book has no
index or bibliography. Many of the findings
which appear revelationary to the economist-authors
will be considered commonplace by
psychologists. No reference is made to other
work in the field of worker motivation, job
satisfaction, or job preferences. Many of these
other researches, apparently conducted with
care equal to this one, conflict with some of the
findings reported here. Neither such conflicts
nor the agreements are mentioned. This is
unfortunate for the lay as well as technically
trained reader. The book frequently refers to
“workers.” It is to be regretted that the more
restricted “manual workers” was not invariably
used to prevent the reader from
over-generalization.

Nevertheless, this small book is fascinating 
reading, and is packed with interesting facts 
and views. Few readers will finish the book
without mental stimulation and new ideas. 


C. E. Jurgensen 


Minneapolis Gas Company. 


Tobias Wagner. Selective job placement. New
York: National Conservation Bureau,
Association of Casualty and Surety Executives,
1946. Pp. 151.


The author of this little volume has compiled
a number of tables and charts from the
comparative study of the work efficiency of a
group of normal workers and a group of
workers with various types of disabilities.
These comparisons are presented in the form of
averages, e.g., average number of days absent
per year, average rate of production, average
rated quality of production, etc. Much of the
potential value of the book as a source of
information on the industrial usefulness of the
disabled is lost because the author has failed to
present two very important types of statistical
information. At no point in the book does
the author state the number of cases of either
normal or disabled workers concerned in the
research. Despite the bristling appearance of
the book in terms of the statistical data
presented, no indications of the reliability of the
averages nor any test of the significance of the
differences of the averages is ever presented. These
are serious shortcomings in any research work,
and the omission is difficult to understand.

In general, the approach to the topic of 

worker placement, whether disabled or non- 
disabled, is rather elementary. If Dr. Wagner 
was writing this book for the lay person, his 
looseness of description and generality of state- 
ment might be forgiven, but he speaks of 
directing this book toward “personnel man- 
agers, safety engineers, rehabilitation special- 
ists, and employers in general.” 
The general impression given by the author
is that he is selling the merits of disabled
workers to the industrial employer. However,
it does appear that the disabled are at least as
efficient in industrial jobs where their disability
is not a handicap, as the normals in the
same jobs.

The book might well be used in industrial
concerns where attempts are being made to
get management to accept disabled people for 
employment. It could be used as an argument 
for the disabled. It does not, however, seem 
to have any place in the college curriculum as a 
text, since the contribution of the book is 
rather limited. 

The somewhat repetitious style of the
author makes continued reading a little difficult,
as the same statements or ideas are presented
at the beginning and the end of consecutive
chapters. Such terms as always, every,
entirely, absolutely, impossible, constantly,
absolutely necessary, all, and completely
are often used ill-advisedly.

The book has merit in bringing together
in one volume the research conducted by Dr.
Wagner, and despite its brevity, it covers a
great deal of ground. The section on the types
of physical disabilities reads like a medical
dictionary, and covers the orthopedic, visual,
and hearing disabilities. Despite the inadequacy
of the reported statistics, this book
would serve as a useful reference on this
specific topic.


A. A. Canfield
University of Southern California

Flesch, Rudolf. The art of readable writing.
New York: Harper and Brothers.
Pp. 237. $3.00.


“To come right out with it,” writes the
author, “this is a book on rhetoric. Its

pose is to help you in writing." With this in- 
troduction Dr. Flesch attempts to present a 
"modern, scientific rhetoric" for informal, use- 
ful, every-day writing. While this reviewer 
would be presumptive to evaluate the rhetoric 
per se, it is his opinion that this book is 
an excellent contribution to more effective 
communication. 

The Art of Readable Writing takes a wider 
view of the writing process than its predecessor, 
The Art of Plain Talk. While the first book 
was primarily concerned with the readability of 
sentences and words, in the second book the 
whole technique of writing readably is surveyed. 

In chronological sequence Flesch takes up 
the problems which a writer faces (or should 
face) in writing. In clear short chapters, 
packed with well-chosen examples to illustrate 
his meanings, he follows the writing effort from 
the initial idea to the final draft. How-to- 
start is dealt with by concrete suggestions for 
audience appraisal, collecting facts, assimila- 
ting data and getting a “slant” or theme. 
What-to-say is covered by recommendations 
for meaningful introductions, narrative style 
and personalizing the message. How-to-say- 
it is given with instructions on writing col- 
loquially, shaking off some misconceptions 
about form and content, and achieving the 
correct level of difficulty and interest. 

In addition Flesch warns his readers about 
misunderstandings arising from ambiguous 
words and reader errors. He includes an 
especially timely caution concerning the use 
of diagrams and pictures to convey information 
without adequate written explanation. 

An appendix explains the use of the revised 
readability yardsticks essentially as given in 
J. appl. Psychol., 1948, 32, 221-233. Nomo- 
graphs for computing reading ease and human 
interest scores are printed on the end papers 
of the book. A good, annotated reading list 
for more specific help in writing and a complete 
set of notes on the illustrative and scientific 
references for each chapter round out the book.
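The readability yardsticks mentioned in the appendix can be illustrated with a minimal sketch. The constants are those commonly cited for Flesch's 1948 revision (reading ease from syllables per 100 words and words per sentence; human interest from personal words and personal sentences). The function names and the sample counts are hypothetical, and the counts are assumed to have been tallied by hand.

```python
def reading_ease(words, sentences, syllables):
    """Reading ease score from raw counts (constants as commonly cited for 1948)."""
    wl = 100.0 * syllables / words   # word length: syllables per 100 words
    sl = words / sentences           # sentence length: average words per sentence
    return 206.835 - 0.846 * wl - 1.015 * sl


def human_interest(words, sentences, personal_words, personal_sentences):
    """Human interest score from raw counts (constants as commonly cited for 1948)."""
    pw = 100.0 * personal_words / words           # personal words per 100 words
    ps = 100.0 * personal_sentences / sentences   # personal sentences per 100 sentences
    return 3.635 * pw + 0.314 * ps


# Hypothetical 100-word sample: 5 sentences, 150 syllables,
# 6 personal words, 1 personal sentence.
print(round(reading_ease(100, 5, 150), 1))
print(round(human_interest(100, 5, 6, 1), 1))
```

Higher reading ease scores indicate easier text; the nomographs on the end papers serve the same purpose without arithmetic.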

The presentation is skillful, readable and 
entertaining throughout. While the work may 
not be as “scientific” as Flesch implies it is, it 
is certainly an excellent combination of current 
knowledge with the shrewd insight and “know- 
how” of a readability expert who can really 
write. 


72 Book Reviews 


This book is one which should be read and 
kept on the desk of everyone who writes to be 
understood. It has implications for all areas 
of writing: letters, news stories, textbooks, 
handbooks, advertisements or union contracts.
It is a book of high interest and real importance 
to communicating people whether they be in 
academic or applied fields. 


James J. Jenkins 
University of Minnesota 


Williamson, E. G. (Editor). Trends in student 
personnel work. Minneapolis: The Uni- 
versity of Minnesota Press, 1949. Pp. x+ 
417. $5.00. 


Too often the papers presented at a con- 
ference commemorating some individual or in- 
stitution sound like the minutes of a mutual 
admiration society when later published. This 
volume of forty-three papers by forty authors, 
although originally given in November, 1947 
at the University of Minnesota to celebrate 25 
years of personnel work there, is a notable ex- 
ception for several reasons: (1) as the title 
indicates, the authors were as much concerned 
with the future steps to be taken as with the 
accomplishments of the past; (2) the breadth of 
the Minnesota program as reviewed insured a 
comprehensive treatment of the total field; 
and (3) the long period of leadership in the 
field made the event of general as well as local 
interest. 

Following an appropriate introduction, the 
papers deal successively with the development 
of student personnel work in all its major 
aspects. Part I. The Role of Personnel Work 
sets the tone of the conference by showing, in 
articles by Willey, Cowley, MacLean, and Mc- 
Clusky, how the personnel point-of-view arose 
and its necessity for constructing a sound 
educational program for American youth. 
Following a survey (Part II) of aptitude and 
interest and personality testing by Stuit and 
Darley, respectively, in Part III, Paterson 
demonstrates vocational counseling utilizing 
the best of available case-study and measure- 
ment techniques and Shartle discusses areas of 
research necessary to understand occupational 
adjustment. 

Part IV, by Bordin and Porter, helps the 
reader to understand the directive versus non-directive
controversy in the light of the basic
issues and long-term trends involved. Subse- 
quent parts deal with the more specialized 
areas of student personnel work including 
mental hygiene, problems of veterans, women 
students, and foreign students, marriage coun- 
seling, discipline, speech, financial problems, 
medical services, housing. The problems of 
selection, training and utilizing faculty counselors
are given extensive treatment as is
religious counseling, as viewed by representa- 
tives of the major faiths. In an integrated 
series of papers, Lloyd-Jones, Wrenn, and 
Darley trace the history of personnel work as 
a profession and present a realistic appraisal 
of its present stage of development. Although 
emphasized throughout, the importance of the 
social and cultural influences of the campus 
upon the student's total growth is pointed up 
in papers by Cowley and Sutherland. In the 
concluding Part XIV, Turner and Tyler discuss 
recent developments in achievement testing 
and their effect upon college admission policies 
and curriculum construction. References at 
the end of each paper and a detailed index of
the volume increase its value as a source-book.

This publication should have a wide reader- 
ship. Administrators will find it a powerful 
stimulus to the development of student personnel
programs and faculty will be given an
understanding of the rationale and functions
of student personnel work. Personnel workers
will find many suggestions for improving their
own programs and personal effectiveness and
students training for personnel work will find
it a useful reference book and source of ideas
for significant research. Although not designed
as a textbook in the ordinary sense, it
can readily serve as the basic source-book for
a graduate seminar in student personnel
administration.


Albert S. Thompson 


Teachers College, 
Columbia University 


Brouwer, Paul J. Student personnel services in
general education. Washington, D. C.:
American Council on Education,
Pp. 317. $3.50.


Student Personnel Services in General Education
covers the period from 1939 to 1944. It
includes a description of participation by an
original group of 22 colleges and universities
minus 7 drop-outs, plus three late additions. 
It is a bewildering book. Basically, it is a 
report of the findings about student personnel 
services offered by an extremely heterogeneous
group of colleges sampled to include “the land-grant
college, the municipal university, the
state teachers college, the independent liberal
arts college, the Catholic college, the Protestant
church-related college, the Negro college,
the four-year college for women, the junior
college for women, and coeducational junior
college.” Some of its best parts are those
which reflect the thinking of a particular
author, rather than the agreements of the
group which participated in the program.

The heterogeneity of participating institutions,
plus individual viewpoints in the writing
of various sections, contribute to unevenness
in writing, and some lack of coherence. Unfortunately,
there is neither a subject-matter
nor author index. These omissions offer the
usual handicap which drives the interested but
baffled reader to frenzy.

The reviewer found two sections of particular
interest. These are Section 1, The counseling
service (pp. 7-46) and Section 5, Pre- and
post-college personnel services (pp. 102-114).
The first has been written quite recently, because
it contains some excellent writing covering
the extremes of the continuum of counseling
methodology. Some of the materials
would not have been available if they had been
written at the time the program was in progress.
The author almost makes a distinction
between counseling and advising. Had this
distinction been clearly made, it would have
done much to clarify confusion in this regard
in the minds of many college administrators
and college instructors. The term counseling
is encountered as a process carried on by students.
Those readers who struggled through
graduate courses and degrees to reach the first


rung on the professional counseling ladder 
must be excused for the slight shudder. 

Section 5, Pre- and post-college personnel 
services, is weak in contrast to Section 1. A 
real question is whether the legitimate ob- 
jective of friendliness and rapport is not suffi- 
cient to be gained from these methods without 
making the questionable claim that we know 
the student sufficiently well through casual 
interview and unverified information to help 
him materially in his planning. 

Part III, The principles of personnel services 
(Sister Annette) is, in the opinion of the re- 
viewer, the heart of the book for those who are 
interested in counseling as the foundation of 
student personnel work. Sister Annette’s rec- 
ognition of differences in the basic individual 
personality structures of teachers and its effect 
on what they can do as personnel workers is 
refreshing (p. 242). 

General criticisms of the book include fail- 
ure to give adequate attention to biological 
heredity, to special aptitudes and abilities, to 
mental organization as a basis for curriculum 
building, to trait variability, and to multiple 
curricula. Despite these shortcomings, parts 
of the book are valuable to an extent that
causes the reviewer to recommend it for pur- 
chase. It is hoped that the policies of the 
American Council on Education will encourage 
more specificity in its publications treating 
student personnel work than has been the case 
in this publication. If some of the topics in 
this book had been treated in terms of insti- 
tutional enrollment differentials and privately 
versus publicly supported colleges, the publica- 
tion might have been far more valuable to a 
larger group of readers. And, finally, can’t 
we have a law which requires an index for all 
books? 

Milton E. Hahn 


University of California, Los Angeles 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor, 
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota


Pavlov, a biography. B.P. Babkin. Chicago: 
University of Chicago Press, 1949. Pp. 
470. $6.00. 

Interaction process analysis. Robert F. Bales. 
Cambridge: Addison-Wesley Press, Inc., 
1949. Pp. 224. $6.00. 

Workbook in personnel methods. Robert M. 
Bellows and Carl H. Rush, Jr. Dubuque: 
Wm. C. Brown Co., 1949. Pp.102. $2.10. 

Conference methods in industry. Henry M. 
Busch. New York: Harper and Brothers, 
1949. Pp. 107. 

Applied experimental psychology. Alphonse 
Chapanis, Wendell R. Garner, and Clifford 
T. Morgan. New York: John Wiley and 
Sons, Inc., 1949. Pp. 434. $4.50. 

In the name of common sense. Revised edition. 
Matthew N. Chappell. New York: The 
Macmillan Co., 1949. Pp. 172. $2.75. 

Essentials of psychological testing. Lee J. Cron- 
bach. New York: Harper and Brothers, 
1949. Pp. 475. $4.50. 

Current trends in industrial psychology. Wayne 
Dennis et al. Pittsburgh: University of 
Pittsburgh Press, 1949. Pp. 198. $3.75. 

Courts on trial. Jerome Frank. Princeton: 
Princeton University Press, 1949. Pp. 441. 
$5.00. 

Psychoanalysis. Second edition. Edward 
Glover. New York: Staples Press, Inc., 
1949. Pp. 367. $4.00. 

The prediction of categories from measurements: 
with applications to personnel selection and 
clinical prognosis. J. P. Guilford and 
William B. Michael. Beverly Hills: Sheri- 
dan Supply Co., 1949. Pp. 55. $1.40. 

Organization of behavior. D. O. Hebb. New 
York: John Wiley and Sons, Inc., 1949. 
Pp. 335. $4.00. 

Trends in industrial relations. Bulletin No. 16. 
Alexander R. Heron et al. Pasadena: 
Industrial Relations Section, California Institute
of Technology, 1949. Pp. 88.
$1.00.

Bristow Rogers: American Negro. Else P. 
Hillpern, Irving A. Spaulding, and Edmund
P. Hillpern. New York: Hermitage House,
Inc., 1949. Pp. 184. $3.00. 

Leadership and isolation. Helen Hall Jennings.
New York: Longmans, Green, and Co., Inc., 
1949. Pp. 240. $3.00. 

Therapeutic group work with children. Gisela 
Konopka. Minneapolis: University of 
Minnesota Press, 1949. Pp. 134. $2.50. 

The magic cloak. James Clark Moloney.
Wakefield, Mass.: The Montrose Press, 
1949. Pp. 345. $5.00. 

Introduction to psychopathology. Lawrence I. 
O'Kelly. New York: Prentice-Hall, Inc.,
1949. Pp. 736. $4.50.

Industrial and occupational trends in national
employment. 1910-1940, 1910-1948. Research
Report No. 11. Gladys L. Palmer
and Ann Ratner. Philadelphia: Industrial
Research Department, University of
Pennsylvania, 1949. Pp. 80. $1.00.

Mass communications. Wilbur Schramm, Editor.
Urbana: University of Illinois Press,
1949. Pp. 552. $4.50.

Varieties of delinquent youth. William H. Sheldon.
New York: Harper and Brothers,
1949. Pp. 899. $8.00.

Predicting success in professional schools.
Dewey B. Stuit, Gwendolen S. Dickson,
Thomas P. Jordan, and Lester Schloerb.
Washington, D. C.: American Council on
Education, 1949. Pp. 187. $3.00.

Foundations of method for secondary schools.
I. N. Thut and J. Raymond Gerberich. New
York: McGraw-Hill Book Co., Inc., 1949.
Pp. 493. $4.00.

Personality maladjustments and mental hygiene.
Second Edition. J. E. Wallace Wallin.
New York: McGraw-Hill Book Co., Inc.,
1949. Pp. 581. $5.00.

Hypnotherapy of war neuroses. John G. Watkins.
New York: The Ronald Press Co.,
1949. Pp. 384. $5.00.

Sight, light and efficiency. H. C. Weston.
London: H. K. Lewis and Co., Ltd.,
Pp. 318. 42s. net.

Theory of hearing. Ernest Glen Wever. New
York: John Wiley and Sons, Inc., 1949.
Pp. 484. $6.00.

Children's voluntary reading as an expression of
individuality. Mary Hayden Bowen Wollner.
New York: Bureau of Publications,
Teachers College, Columbia University,
1949. Pp. 117. $2.35.

Identifying and developing potential leaders.
Personnel Series No. 127. New York:
American Management Association, 1949.
Pp. 39. $.75.

The Harvard list of books in psychology. Cambridge:
Harvard University Press, 1949.
Pp. 77. $1.00.


Subscription Lists of the
American Psychological Association 


MEMBERS AND AFFILIATES 
Approximately 9,300 names 


The American Psychological Association main- 
tains an address list of its members and affiliates, 
which is for sale providing the nature of its use is in 
conformity with the purposes of the Association. 


1950 Prices 
Envelopes addressed 
(advertiser furnishes envelopes and pays express charges) 
Addresses on tape, not gummed 
(suitable for a mailing machine) 


STATE LISTS 


Priced according to number of names wanted 


SUPPLEMENTARY LISTS 


Approximately 3,500 names in total list 
Individual journal lists vary from 400 to 1,600 


The Association also maintains a list of subscribers 
who are not members of the Association (universities,
libraries, industrial laboratories, hospitals, other
types of institutions, and individual subscribers). 
The general list for all journals includes all types. 
Each single journal has a more specialized circulation. 


For any one journal, envelopes addressed .... 
For any one journal, addresses on tape 

For all journals, envelopes addressed 

For all journals, addresses on tape 


For further information, write to 


American Psychological Association 
1515 Massachusetts Avenue Northwest 
Washington 5, D. C. 


Journal of Applied Psychology 


Vol. 34, No. 2


APRIL, 1950 


A Difficult New Test of Mechanical Comprehension * 


William A. Owens, Jr. 
Iowa State College 


The Bennett Test of Mechanical Comprehension
is undoubtedly the prototype among
schematic, pictorial tests which place a premium
upon mechanical "common sense" and
insight into basic, physical principles. In a
recent survey of V.A. Guidance Centers
throughout the country it was reported in use
in 90% of them, and in frequent use in 64% (1).

Since it is so generally employed, it is only
natural that it should occasionally be utilized
with populations differing rather markedly
from that for which it was originally intended.
Under these circumstances maximally satisfactory
results can hardly be expected.

The writer has for some time been associated
with a collegiate institution at which academic
mortality in the Engineering Division is heavy,
and in which there has been correspondingly
great need for improved tools and techniques
of selection. In early 1946 it became apparent
that a revised and extended Bennett-type test,
more schematic and more difficult than Form
BB, might be a real asset in this context.
Such a test has been constructed and will be
evaluated and incidentally described in what
follows.


Problem 
The central problem of the present investigation
is to evaluate this new Test of Mechanical
Comprehension, Form CC, in the potential
selection of engineering students.


Materials 


Figure 1 contains typical content from the
new test. It may be observed that there are
five response possibilities, instead of the three
previously employed, and that the probability
of chance success is thereby reduced. It may
also be noted that the drawings are quite
schematic, unembellished line drawings which
assume a reasonable level of sophistication.
Like Form BB, Form CC is an untimed
power test.

* The writer is deeply indebted to Dr. James Wert
for statistical assistance and advice.


60. Which sprinkler will spin the fastest?
(1) A, (2) B, (3) C, (4) D, (5) E.

61. With the drive D as indicated, which picture
correctly shows how the gears would turn?
(1) A, (2) B, (3) C, (4) D, (5) E.

62. Which picture shows a physical impossibility?
(1) A, (2) B, (3) C, (4) D, (5) E.

Fig. 1. Typical Content, Mechanical Comprehension
Test, Form CC.
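The effect of moving from three to five response possibilities on chance success can be made concrete with a small sketch. The source does not state the length of Form CC, so the item count of 60 below is purely hypothetical, as is the function name.

```python
from math import sqrt


def chance_success(options, items):
    """Probability of a correct blind guess, plus the mean and standard
    deviation of the score expected from guessing on every item."""
    p = 1.0 / options
    expected = items * p
    sd = sqrt(items * p * (1.0 - p))
    return p, expected, sd


# Hypothetical 60-item test: three choices versus five choices.
for k in (3, 5):
    p, expected, sd = chance_success(k, 60)
    print(k, round(p, 3), round(expected, 1), round(sd, 2))
```

With five choices the expected guessing score is lower, which leaves more of the score range available to reflect real comprehension.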

Subjects 


The subjects upon whose scores the suc- 
ceeding results are based were all students in 
the Engineering Division at The Iowa State
College. Freshman, Sophomore, and Senior
classes are represented at various points, and
the total number of cases tested is 725.


Methods 


The data obtained were accorded the following
types of treatments. 1. Test reliability,
within this relatively homogeneous group, was
determined; and Mechanical Comprehension
Test (M.C.T.) scores were correlated with
scores on the A.C.E. Psychological Examination
to estimate the relationship of the former
to intelligence. 2. Hierarchies of scores made
by groups presumably differing in the ability
in question were tabulated. 3. Correlations
of M.C.T. scores with certain selected course
grades were determined. 4. A random sample
of 321 freshmen tested in 1947 has now been
followed longitudinally for a period of three
years to determine the relationship of M.C.T.
score to academic mortality. 5. Biserial correlations
were computed between test score
and “A” achievement in selected courses vs.
the balance of the achievement distribution;
and, similarly, between test score and “F”
failure to achieve vs. the balance of the achievement
distribution. An estimate of the efficiency
of test prediction at the high and low
ends of the achievement continuum has thus
been obtained. 6. Multiple correlations were
computed between the combined effects of
A.C.E. score, high school average, and M.C.T.
score and the criterion of selected course grades.
Several estimates of the independent contribution
of M.C.T. score have been obtained.
7. The importance of certain background
factors, such as farm rearing and physics
training, in the determination of M.C.T. score
has been estimated, through application of
analysis of variance and covariance techniques.
8. M.C.T. scores have been correlated with
scores on the Kuder Preference Record in an
attempt to uncover relevant factors of interest
and motivation.


Table 1

M.C.T. Score and Educational Status

                                 Inter-Quartile
Group                    Mean        Range         N

Agric. E. Freshmen        36         30-41        177
Engineering Freshmen      39          ...         321
Archit. E. Sophomores     42         35-46        120
Engineering Seniors       47         42-31        107


Results 


The primary results of this study have been
summarized in Tables 1 through 7. Prior to
more detailed comments upon these it may be
noted that the corrected odd-even reliability
of the new M.C.T. in a population of 321 engineering
freshmen was 0.80, and that scores on
it correlated 0.39 with scores on the A.C.E.
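The "corrected odd-even reliability" reported here is, in a minimal sketch, a half-test correlation stepped up to full length. The source does not name the correction; the Spearman-Brown formula is assumed below because it is the customary one for odd-even splits. The function names are hypothetical.

```python
def spearman_brown(r_half, factor=2.0):
    """Step a part-test correlation up to a test `factor` times as long."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def implied_half_test_r(r_full):
    """Invert the doubling case: the odd-even correlation implied by a
    corrected full-length reliability."""
    return r_full / (2.0 - r_full)


# Under this assumption, a corrected reliability of 0.80 would correspond
# to an uncorrected odd-even correlation of about 0.67.
r_half = implied_half_test_r(0.80)
print(round(r_half, 2), round(spearman_brown(r_half), 2))
```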

Table 1, then, shows a hierarchy of M.C.T.
scores which accords with expectation. It is
apparent that educational attrition has eliminated
many of the low scoring individuals. It
is also suggested that the test is mainly an
aptitude test in that, while medians rise
through negative selection, only three of the
seniors scored higher than the highest entering
freshman, and that by mere 1 and 2 point
margins.

Table 2 shows some representative correlations
of the M.C.T. with course grades in
various subject matter areas. Theoretical
and Applied Mechanics (r = 0.49) involves the
physical science subject matters of “statics”
and “dynamics.” Correlations with senior-level
courses have been corrected for restriction
in range of test scores because primary interest
centers in the predictive value of the test as
applied to freshman populations. Even so,
the indicated relationships almost certainly
provide underestimates of the corresponding
true values, as the Registrar’s Office at Iowa
State College allows repetition of a course with
only the last grade obtained appearing on the
official record. While it is true that the correlation
of test scores with shop grades is not
high, two circumstances seem noteworthy: (1)
There was only one experienced instructor in
the group of five who estimated quality of shop
product, and the test correlated 0.48 with his
ratings; and (2) The A.C.E. and high school
average correlate only 0.09 and 0.11, respectively,
against this same shop criterion, thus
making it appear that the M.C.T. predicts it
relatively well.


Table 2

Correlations of M.C.T. Scores with Course Grades

                                          r       P      Engineering Group

1. Engineering Courses
   Theoretical and Applied Mechanics     0.49*   <.01    Eng. Seniors (N = 107)
   Median grades in 7 relevant courses   0.41*   <.01    Eng. Seniors (N = 107)
2. Drawing-Shop
   Drawing and Projection                0.39    <.01    Arch. Eng. Sophs (N = 120)
   Shop (estimated quality of product)   0.32    <.01    Ag. Eng. Sophs (N = ...)
   Drafting                              0.30    <.01    Eng. Freshmen (N = 260)
3. Math-Science
   Chemistry                             0.34    <.01    Eng. Freshmen (N = 260)
   Algebra                               0.28    <.01    Eng. Freshmen (N = 260)

* These coefficients are corrected for restriction in range of test scores.


Table 3

M.C.T. Scores and Mortality in Engineering

                              N      Q3     Md     Q1

1. Passing                    93     49     43     37
2. Failed                     67     40     34     31
3. Dropped voluntarily        94     43     39     34
4. Changed divisions          57     ...    40     33
5. Changed institutions        7     ...    42     ...
                     Total = 318

Passing vs. Failed, r bis = 0.59; Failed vs. All Others,
r bis = 0.46.
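The correction for restriction in range applied to the senior-level coefficients in Table 2 can be sketched as follows. The source does not say which formula was used; the version below is the common one for direct selection on the predictor, and it is an assumption here. The standard deviations are hypothetical, since the article does not report them.

```python
from math import sqrt


def correct_for_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the correlation in the unrestricted group from the
    correlation observed in a group restricted on the test itself."""
    k = sd_unrestricted / sd_restricted
    r = r_restricted
    return r * k / sqrt(1.0 - r * r + r * r * k * k)


# Hypothetical figures: seniors observed at r = 0.40 with a test-score SD
# of 6, against a freshman SD of 8.
print(round(correct_for_restriction(0.40, 8.0, 6.0), 2))
```

When the two standard deviations are equal the coefficient is returned unchanged, and the more the surviving group has narrowed, the larger the upward adjustment.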
Table 3 shows the relationship between the
subsequent dispositions of 318 engineering
freshmen tested during the Spring quarter of
1947, and the M.C.T. scores which they made
at that time. All but three of the original group
are accounted for; one has been killed and two
have records too confused to be useful. The
Table gives the status of all subjects as of mid-April,
1949, at which time those listed as
“passing” were third quarter juniors. It may
be noted that the pass-fail biserial correlation
which appears below the table furnishes a most
encouraging estimate of test validity, and that
even if all “voluntary drops” and “transfers”
be classed as “successes” the relationship with
test score remains substantial.


Table 4

M.C.T. Prediction by Achievement Level

                                 Low       High
                                (F's)     (A's)
                     Entire      vs.       vs.
Course               Group     Balance   Balance     N

Math. (Algebra)       0.28      0.30      0.41      260
Chem. (General)       0.34      0.26      0.48      260
English (Comp'n)      0.16      0.10      0.16      260

Combined probability that “Low” and “High” r’s do
not differ significantly = <.01.
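The biserial coefficients used for the pass-fail comparison and for the achievement-level comparisons can be sketched with the usual textbook formula: the difference between the two group means, divided by the standard deviation of all scores, times pq/y, where y is the normal ordinate at the point of dichotomy. The scores below are invented for illustration; the function name is hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(upper_scores, lower_scores):
    """Biserial correlation between a continuous score and a dichotomy
    assumed to rest on an underlying normal variable."""
    n_upper, n_lower = len(upper_scores), len(lower_scores)
    p = n_upper / (n_upper + n_lower)   # proportion in the upper category
    q = 1.0 - p
    sd_total = pstdev(list(upper_scores) + list(lower_scores))
    normal = NormalDist()
    y = normal.pdf(normal.inv_cdf(p))   # ordinate at the cut between p and q
    return (mean(upper_scores) - mean(lower_scores)) / sd_total * (p * q / y)


# Invented test scores for a "passing" and a "failed" group.
passing = [40, 45, 50]
failed = [30, 35, 40]
print(round(biserial_r(passing, failed), 2))
```

Because of the pq/y factor, a biserial coefficient is always larger in absolute value than the point-biserial computed from the same data.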

Table 4 shows how achievement level relates
to the efficiency with which the M.C.T. will
predict grades in a sampling of freshman
courses. As estimated from the 3 combined
probabilities of a real difference, it appears that
the correlations in Column 3 are significantly
larger than those in Column 2, and that the test
is, therefore, a better predictor of the grades
of the more able students than it is of those of
the less able. Since it was originally intended,
in the construction of Form CC, to produce a
test which would differentiate at high ability
levels, these results are as desired and suggest
that the stated purpose has been realized.


Table 5

Prediction of M.C.T. with A.C.E. and High School Average (H.S.A.)

                                                        A.C.E.,     Per Cent of
                  (1)      (2)      (3)      A.C.E.,    H.S.A.,      Variance
Course           A.C.E.   H.S.A.   M.C.T.    H.S.A.     M.C.T.    (1), (2) & (3)

Drawing           0.33     0.25     0.30      0.34*      0.39*       44-15-41
Chemistry         ...      0.53     0.34      0.58       0.60*       33-47-20
1st Qtr. Av.      ...      ...      0.37      0.58*      0.61*       41-36-23

* Probability of non-significant difference = <.01.


Table 6

Background Factors and M.C.T. Score

Characteristic             Highest Score              F         P

Time on farm               Entire time               2.09      >.05
Father's occupation        Agriculture               7.60      <.01
Years of shop classes      More than one             1.25      >.05
Years shop experience      More than one             0.87      >.05
Years physics classes      One half year or more    11.47**    <.01
Age                        25 years or more          0.05      >.05

** Score difference, Physics vs. Non-Physics means,
is 4.5 points; initial ability controlled, via analysis of
covariance, score difference is 2.5 points.

Table 5 shows that M.C.T. score makes a sig- 
nificant contribution to the prediction of first 
quarter grades, above and beyond those made 
in combination by score on the A.C.E. and by 
high school average. In each case the inde- 
pendent contribution of the M.C.T. is highly 
significant in spite of the fact that the criterion 
is not particularly appropriate. Column 6, at 
the extreme right, contains estimates of the 
percent of the total predictable variance in 
course grades contributed, successively, by the 
A.C.E., high school average, and the M.C.T. 
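The multiple correlations discussed for Table 5 can be sketched from a set of intercorrelations. The approach below solves the normal equations for the standardized regression weights and takes R squared as the sum of each weight times its validity. The article does not report the predictor intercorrelations or its method of apportioning variance, so every figure here is hypothetical, and the share formula (weight times validity, over R squared) is only one convention, assumed for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (small systems only)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def multiple_r(r_xx, r_xy):
    """Multiple correlation of a criterion with several predictors.

    r_xx: predictor intercorrelation matrix; r_xy: predictor validities.
    Returns (R, standardized weights, share of predictable variance)."""
    betas = solve(r_xx, r_xy)
    r_squared = sum(b * r for b, r in zip(betas, r_xy))
    shares = [b * r / r_squared for b, r in zip(betas, r_xy)]
    return r_squared ** 0.5, betas, shares


# Hypothetical two-predictor case: validities 0.30 and 0.40,
# predictors intercorrelated 0.50.
R, betas, shares = multiple_r([[1.0, 0.5], [0.5, 1.0]], [0.30, 0.40])
print(round(R, 3), [round(s, 2) for s in shares])
```

Adding a third row and column for another predictor and comparing the two values of R shows that predictor's independent contribution.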
The data for Table 6 were obtained by securing
certain presumably relevant information
about the backgrounds of 177 Agricultural
Engineers who subsequently took the M.C.T.
Responses to each item were assigned to one
of two or three categories, and the test scores
of the respondents were grouped accordingly;
analysis of variance was then applied to estimate
the significance of the intergroup differences
in test scores. As indicated, only two of
the six potential relationships were statistically
significant. Confirming studies on urban-rural
differences, subjects whose fathers were
engaged in some phase of agricultural work
did have mean test scores about 3 raw score
points higher than those whose fathers were
otherwise classified occupationally. Likewise,
subjects who had had one-half year or more of
high school physics had mean test scores approximately
4.5 raw score points higher than
those who had had less than this amount. It
will, however, be immediately granted that
there is some tendency for only the initially
superior students to elect this difficult subject.
Analysis of covariance was, therefore, applied
to the results in the manner suggested by
Lindquist (4) to control on inter-group differences
in initial aptitude as evidenced by A.C.E.
scores. This had the effect of reducing the
difference between the adjusted physics and
non-physics means to approximately 2.5 raw
score points. Such a result is interesting in
that mechanical comprehension tests have
sometimes been criticized for their undue dependence
upon formal physics training, apparently
a relatively unwarranted criticism in
fact.


Table 7

Correlations of M.C.T. with Kuder P.R.
for 260 Students

Kuder Scales
Me Co Sc P A L Mu

.20* -.03 .26* -.07 -.02 -.05 .17*

* 1% r = 0.16.

Table 7 gives the intercorrelations of the
M.C.T. and the Kuder Preference Record. It
will be noted that the four starred coefficients
differ significantly from zero, but insignificantly
from each other. Some motivational influences
may be indicated in the fact that those
who are above average in scientific and mechanical
interest tend to score higher on the
M.C.T. It is also interesting to observe that
the four significant relationships
would fairly well define a typical engineer's
profile on the Kuder, except for the absence of
Computational interest.
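The footnote to Table 7 gives 0.16 as the value of r required at the 1% level for 260 students. A minimal sketch of where such a threshold comes from is given below. It uses a large-sample normal approximation to the t distribution, which is an assumption adopted here for simplicity; the article does not say how its value was obtained. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def critical_r(n, alpha=0.01):
    """Smallest absolute r that differs from zero at the two-tailed
    `alpha` level, using a normal approximation suited to large n."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z / sqrt(n - 2 + z * z)


print(round(critical_r(260), 2))
```

For samples this large the approximation and the exact t-based value agree to two decimals; for small samples a t table should be used instead.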




Conclusions 


The following conclusions seem warranted
relative to the role of this new test of mechanical
comprehension in engineering student
selection.

1. In an engineering population, it yielded an
odd-even reliability of 0.80 and correlated 0.39
with the A.C.E. Psychological Examination.

2. It yielded hierarchies of scores, by groups,
which accorded with expectation.

3. It showed satisfactory correlation with
relevant course grades.

4. It was an excellent predictor of academic
mortality.

5. It predicted better at high achievement
levels than at low.

6. In prediction of course grades, it made a
significant independent contribution beyond
that made by A.C.E. and high school average.

7. Score on it seemed to depend upon background
of training or experience to the extent
of only 2 or 3 raw score points.


8. It correlated with scores on the Kuder
Preference Record about as would be expected
of a test of mechanical or engineering aptitude.


Received July 8, 1949. 


References 


1. Baker, Gertrude, and Peatman, J. C. Tests used 
in Veterans Administration advisement units. 
Amer. Psychologist, 1947, 2, 99-102. 

2. Bennett, G. K., and Cruickshank, R. M. A summary
of manual and mechanical ability tests.
New York: The Psychological Corporation, 1942,
Pp. 75.

3. Kellum, W. E. Recent developments in selection of
candidates for aviation training. Amer. J. Psychiat.,
1943, 100, 80-84.

4. Lindquist, E. F. Statistical analysis in educational 
research. New York: Houghton Mifflin Com- 
pany, 1940. Pp. 257 and appendix. 

5. McDaniel, J. W., and Reynolds, W. A. A study of 
the use of mechanical aptitude tests in the selec- 
tion of trainees for mechanical occupations. 
Educ. & Psychol. Measmt., 1944, 4, 191-197. 

6. Traxler, A. E. Correlations between “mechanical 
aptitude” scores and “mechanical comprehen- 
sion” scores. Occupations, 1943, 22, 42-43. 


Study of Executive Leadership in Business. 
III. Goal and Achievement Index * 


C. G. Browne 


Wayne University 


The importance of the establishment of or- 
ganizational goals, purposes, and objectives; 
the need for the effective communication of 
these established goals throughout the organ- 
ization; and the evaluation of past achieve- 
ments of the organization have been recognized 
by various writers on organization and execu- 
tive activities. These factors are important 
because they are influential in decision-making, 
one of the characteristics of executive activity 
and leadership, which should be based on a 
purposive moving toward common objectives 
and which extends through all levels of super- 
visory activity. A method which can be used 
to study the effectiveness of these three func- 
tions in any given situation among any given 
group of employees will constitute a valuable 
addition to the total study of executive leader- 
ship and the eventual evaluation of executive 
performance. 


Procedure 


This article is devoted to the presentation of 
a method, hereinafter referred to as the Goal 
and Achievement Index, which has been ap- 
plied on an exploratory basis to the study of 
23 top executives in a tire and rubber manu- 
facturing company. Previous articles have 
presented other methods used in the study 
(25). 

The subjects for the total study constituted 
all of the executives on the first, second, third, 
and fourth echelons of the business with the 
exception of one executive on the third echelon. 
They were classified into the following depart- 
mental groups: general administration, 4 cases; 
sales, 6; finance, 4; manufacturing, 8; and 
personnel, 2. 

* The writer expresses his appreciation to Dr. C. L.
Shartle for his original suggestion that a goal and
achievement method be included in the explorations.

By his choice, the Vice-President-Sales was not
included in the Goal and Achievement Index data.


The detailed procedure followed in devising
the Goal and Achievement Index form and in
accumulating the data on it has been described
elsewhere (1). However, the general procedure
was as follows:


1. In a personal interview, each executive enumer- 
ated the goals of the company as he understood they 
existed in the planning of top management. The 96 
goals enumerated then were classified into the five 
major departments of the business to which they best 
applied. 

2. Seventeen representative statements were chosen 
and arranged in the same random order on two scales, 
one scale for the evaluation of the statements in terms 
of their importance as goals and the other for their 
evaluation in terms of the degree of company achieve- 
ment in each. 

3. The executives rated the statements on a five point rating scale which ranged from “extremely important” to “no importance” as goals and from “completely achieved” to “nothing done” in achievement terms.

4. In calculating the individual goal and achievement scores, a point value of from one to five was assigned to each of the rating columns, the lower point value indicating a higher rating column. The individual’s score on each scale was the total points for the 17 statements as he rated them, a low score indicating higher goal importance and degree of achievement. To determine the rank order of the statements by departmental groups, the individual points for each statement of the executives in each departmental group were added, and the statements were arranged in rank order on the basis of the total number of departmental points for the statements.
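The scoring in step 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the executive labels, and the four-statement ratings are hypothetical, and giving tied statements the mean of the ranks they occupy is inferred from the half ranks (4.5, 5.5, 11.5) that appear in Table 1 rather than stated in the procedure.

```python
from collections import defaultdict


def scale_score(ratings):
    """Total points for one executive on one scale (a lower total is a higher rating)."""
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be a column number from 1 to 5")
    return sum(ratings)


def average_ranks(totals):
    """Rank statements by total points, lowest total first; ties share the mean rank."""
    order = sorted(totals, key=totals.get)
    ranks = {}
    i = 0
    while i < len(order):
        j = i
        # Extend j over every statement tied with the one at position i.
        while j + 1 < len(order) and totals[order[j + 1]] == totals[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def department_ranks(ratings_by_exec, dept_of_exec):
    """Add each statement's points within a department, then rank the statements."""
    totals = defaultdict(lambda: defaultdict(int))
    for name, ratings in ratings_by_exec.items():
        for statement, points in enumerate(ratings, start=1):
            totals[dept_of_exec[name]][statement] += points
    return {dept: average_ranks(dict(t)) for dept, t in totals.items()}


# Hypothetical data: two sales executives rating four statements (the study used 17).
ratings = {"exec_a": [1, 3, 2, 5], "exec_b": [2, 3, 1, 4]}
depts = {"exec_a": "S", "exec_b": "S"}

print(scale_score(ratings["exec_a"]))
print(department_ranks(ratings, depts)["S"])
```

With real data each executive would supply 17 ratings per scale, and the same two functions would be run once for the goal scale and once for the achievement scale.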


Total Group Ratings


Table 1 includes the composite ratings of the total group of 23 cases, with the rank order of each statement for both goal importance and degree of achievement indicated, and the difference between the rank orders. Where the difference between goal and achievement was a plus value, it indicated that achievement in that particular item was considered to be above its importance as a goal of the company, and when the difference was a minus value, the opinion was that the item had more importance as a goal than the company had thus far achieved with it.

The data in Table 1 can be analyzed in terms of departmental activities as indicated by the


Table 1

Rank Order of Composite Goal and Achievement Ratings for Total Group of 23 Executives

                                                                                       1        2        3        4
                                                                                              Goal     Ach.     Diff.
                                                                                              Rank     Rank     Between
Statement of Purpose                                                                 Dept.*   Order    Order    G&A Rank
 1. To create big production.                                                          M      14        7        +7
 2. To make as much money as possible for the stockholders.                            F      13       11        +2
 3. To continue to keep Congo a successful and prosperous company.                     GA      1        6        -5
 4. To establish a more definite and concrete sales policy.                            S       9       13        -4
 5. To develop a better community and help the general prosperity of the city.         P      15        4.5     +10.5
 6. To broaden its field of activity by making a wider variety of products.            M       7       12        -5
 7. To make as fine a tire as any company can make and sell it a little cheaper.       M       5.5      4.5      +1
 8. To promote good labor relations and have satisfied workers.                        P       4        2        +2
 9. To make a fair margin of profit.                                                   F       3       14       -11
10. To make a good and economical tire and sell it to private brand, mass
    distributors at a good price.                                                      S      10        8        +2
11. To increase the proportionate volume of Congo brand business handled.              S      11.5     15        -3.5
12. To develop an extensive, national advertising program.                             S      17       17         0.0
13. To keep the company growing and expanding.                                         GA     11.5     10        +1.5
14. To expand the sales force in the field.                                            S      16       16         0.0
15. To come out on top with a good quality tire and high production.                   M       5.5      9        -3.5
16. To provide good working conditions and good living standards for employees.        P       8        1        +7
17. To attempt to cut costs and expenses and put the company back in the
    profits brackets.                                                                  F       2        3        -1

* Department Code: GA = General Administration; F = Finance; S = Sales; M = Manufacturing; P = Personnel. The department designated for each statement refers to the department of the company to which the statement best applied in the operational activities of the company.


departmental classification of each item, and the individual goals can be studied without regard for their departmental classification. Ideally it could be hoped that the rank order of the statements in terms of goal importance would be the same as the rank order of the statements in degree of achievement terms. This agreement would indicate a balance between these two important areas of executive activity and leadership. However, where differences are shown to exist, the opportunity is provided to top management to study the possibility of establishing policies which will attain a balance by concentrating attention and effort on those activities which it has been indicated were not in balance.


A study of the rank order of the statements 
by goal importance and degree of achievement, 
without consideration of the differences be- 
tween the ranks, will provide information to 
top management which may be used in at- 
tempting to re-educate management in general 
or which will provide the basis for conference 
and consultation between management at 
various echelons regarding the proper place 
of various goals in the business and the extent 
to which these goals have been achieved. 

For purposes of illustration here, those state- 
ments relating to the finance department (nos. 
2, 9, and 17) and those relating to the personnel 
department (nos. 5, 8, and 16) will be examined. 
Statement 9 regarding a fair margin of profit had a minus 11 difference, meaning that these
executives believed the company had fallen 
far short of the great importance of this goal. 
In contrast to this, however, statement 2 “to 
make as much money as possible for the stock- 
holders" received a plus two difference, indi- 
cating that these executives believed the com- 
pany achievements had somewhat exceeded the 
goal importance of this activity. The remain- 
ing finance item with a minus one difference 
between ranks indicated that the company was 
working quite satisfactorily towards cutting 
expenses and putting the company back in the 
profits bracket, the successful completion of
which should change the rank differences on 
statement 9. 

When the statements relating to personnel 
activities were examined, it was noted that all 
three of the statements received a plus differ- 
ence, meaning that in all of these activities the 
executives believed the company had achieved 
more than the goal importance of these personnel functions. The sensitivity of the Goal and Achievement Index form may be judged
somewhat by the fact that, at the time the 
study was conducted, tire orders were falling 
off rapidly; the company was not operating at 
a profit; and it was the general executive 
opinion that the company could not re-establish 
itself on a profit basis in the highly competitive 
tire market until labor costs were reduced. It 
was also known that the union contract was 
about to come up for renegotiation and that 
in all probability there would be a demand for 
higher wages which would eliminate the possi- 
bility of decreasing labor costs. 


Communication and Departmental Group 
Correlations 


Table 2 gives the rank order correlations between goal and achievement ratings of the five departmental groups and of a composite of the president and general manager, vice president manufacturing, and treasurer (hereinafter referred to as PVPMT). The correlation of .47 for the total group actually was the rank order correlation between columns 2 and 3 in Table 1. Similar tables were prepared for the rankings of each departmental group, and the rank order correlations computed from them. The correlations indicated the extent of agreement among executives in a given department on the relationship between the importance of company goals and the degree of achievement.²


Table 2

Rank Order Correlations Between Goal and Achievement Composite Ratings of the 17 Items by Departmental Groups

Group            Correlation
Total                .47
PVPMT                .22
Mfg.                 .49
Sales                .17
Finance              .28
General Adm.         .66
Personnel            .64


² It should be noted that these correlations were descriptive of the relationships existing in this population of executives only, and cannot be considered as sampling statistics.

Communication refers to how well the policies and thinking of top management have been carried through and absorbed by the remainder of the organization. If there were a condition of perfect communication within the company on the questions of goal importance and degree of achievement on the type of statement included in the Goal and Achievement Index form, then it could be expected that the ratings of each departmental group on all of the items would be identical with the ratings of the PVPMT, since they represent top management. If the ratings by departmental groups were the same, the rank order of each statement would be the same for each department, and consequently the rank order correlations of the various departments would be equal. These identical correlations might vary from minus 1.00 to plus 1.00, but as a measure of communication only, the size or the sign of the correlations bears no relationship to the effectiveness of communication. Instead, the effectiveness of communication can be measured only by observing the differences in the size of the correlations between various departmental groups and top management.

Therefore, the correlation of .66 for the general administration group expressed the highest group ratings toward the degree of achievement in terms of the goal importance of the various items. However, this correlation expressed the greatest disagreement with the opinion of the PVPMT and, therefore, indicated the least adequate communication.


On the other hand, the correlation of .17 for 
the sales department, although it expressed the 
lowest departmental ratings toward the degree 
of achievement in terms of the goal importance 
of the items, was in closest agreement with the 
ratings of the PVPMT. 
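The two computations behind this section can be illustrated with a short Python sketch. Several things here are assumptions rather than statements from the article: the rank lists are a reading of columns 2 and 3 of Table 1 in which a few hard-to-read cells were inferred, the simple rank-difference formula without a correction for ties is assumed to be the one used, the correlations are a reading of Table 2, and the absolute difference from the PVPMT value is one plausible way to express "differences in the size of the correlations".

```python
def spearman_rho(ranks_x, ranks_y):
    """Rank-difference formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rank lists of equal length, at least 2 items")
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Goal and achievement ranks for statements 1-17 (assumed reading of Table 1).
GOAL = [14, 13, 1, 9, 15, 7, 5.5, 4, 3, 10, 11.5, 17, 11.5, 16, 5.5, 8, 2]
ACH = [7, 11, 6, 13, 4.5, 12, 4.5, 2, 14, 8, 15, 17, 10, 16, 9, 1, 3]

rho_total = spearman_rho(GOAL, ACH)
print(f"total group rho = {rho_total:.2f}")

# Departmental correlations and the top-management composite (assumed reading of Table 2).
PVPMT = 0.22
DEPARTMENTS = {
    "Mfg.": 0.49,
    "Sales": 0.17,
    "Finance": 0.28,
    "General Adm.": 0.66,
    "Personnel": 0.64,
}

# Communication gap: distance of each department's correlation from top management's.
gaps = {dept: round(abs(r - PVPMT), 2) for dept, r in DEPARTMENTS.items()}
closest = min(gaps, key=gaps.get)
farthest = max(gaps, key=gaps.get)
print(gaps)
print("closest to PVPMT:", closest, "| farthest:", farthest)
```

On this reading the total-group coefficient rounds to the .47 reported in the text, and the gap ordering agrees with the discussion: sales is closest to the PVPMT and general administration is farthest.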


Analysis of Individual Statements 


Although correlations and composite ratings are valuable for the general or over-all picture which they indicate, correlations obscure much detailed data which are available in the Goal and Achievement Index form and which can be analyzed by comparing individual statements in terms of departmental ratings, echelon levels, and individual executives. For the purpose of this article, one of the statements will be considered in terms of the differences in its ratings inter-departmentally.

On statement 10 (Table 1), there was a range of 11.0, the rank extending from 3.5 to 14.5. This statement was specifically important to a long-time company policy of selling tires under a private brand name (not Congo) to large retail outlet chain stores. The PVPMT gave this goal a 3.5 rank, indicating that this policy should be continued, and perhaps even emphasized. However, the sales department ranked this statement 14.5. This may have meant one of two things: (1) that the sales group was not aware of top management attitude toward this policy or (2) that the sales group was expressing its own opinion on the policy without regard to what its knowledge of top management attitude was. Under either possibility, statement 10 was classified as a sales statement regarding which the sales group deviated widely from top management attitude. The implications of this condition for the efficient operation of the company and the performance of the executive leadership within the company may be many.

When the statements were considered inter-departmentally in terms of achievement ranks, the highest range of 13.0 was for statement 1, classified as a manufacturing statement. The rankings extended from 3.0 to 16.0. The manufacturing group and the PVPMT gave this a 3.0 achievement rank, indicating their belief that the degree of achievement was well at the top of company achievements. However, the personnel group which reported to the vice president manufacturing ranked the statement 16.0, and the sales group ranked it next lowest with a 13.0.

Throughout these detailed comparisons, in- 
formation was obtained which provided clues to 
the relationships existing inter-departmentally 
between executives in the business. One pat- 
tern which developed was a general tendency 
for executives in any given department to 
rank a statement which applied to their de- 
partment in the higher ranks obtained by that 
particular statement. For example, a state- 
ment which applied to the sales department 
was likely to be ranked higher both for goal 
importance and for degree of achievement by 
the sales executives than by any other departmental group.


Rankings by Departmental Statements 


In terms of efficient company and executive 
relationships as well as providing a measure of 
communication and knowledge of company 
goals and achievements, the differences be- 
tween ranks assigned to statements grouped 
departmentally provided additional leading in- 
formation. For example, when it was found 
that items which were given little or no goal 
importance were given high degree of achieve- 
ment, or vice versa, then it appeared evident 
that there was a need for corrective activity 
in the particular aspect of company affairs 
covered by the item. 

Table 3 presents the average rank differences 
between goal importance and degree of achieve- 
ment rankings of the statements grouped 
according to the department to which they 
best applied. The rankings were the compo- 
site of each departmental group of executives. 
A plus value indicated that the degree of 
achievement for any given group of depart- 
mental statements as ranked by any given 
group of departmental executives was greater 
than the goal importance of the same group of 
departmental statements as ranked by the 
same group of departmental executives. A 
minus value, on the other hand, indicated that 
the degree of achievement for any given group 
of departmental statements as ranked by any 
given group of departmental executives had 
not reached the goal importance of the same 




Table 3

Average Rank Differences Between Goal Importance and Degree of Achievement Rankings of Departmental Groups of Executives on Goal and Achievement Statements Classified into the Department to Which They Best Apply

Exec.        Genl. Adm.    Finance       Personnel     Sales         Manufacturing
Group        Statements    Statements    Statements    Statements    Statements
Total 17 3.3 6.5 1.1 4 
PVPMT 25 ns 82 E! 24 
Mig. 45 10 6.5 1.2 E! 
Sales 4.7 3.7 6.2 27 29 
Personnel 5 3.0 40 6 0.0 0.0 
Finance 2 4.3 9.0 2.7 5 
Genl. Adm. [7 42 4.2 E 4 A 
group of departmental statements as ranked by the same group of departmental executives.

It will be noted that, with the exception of
the personnel statements, there were only four 
cases of plus values. That is, the executives, 
regardless of department, believed that the 
company had over-achieved in those items 
which relate to personnel activity but in other 
activities there was almost consistent agree- 
ment on under-achievement. The personnel 
executives with an average plus difference in 
rankings of 4.0 expressed the smallest over- 
achievement value on personnel statements. 
On the horizontal rows, the manufacturing 
executives gave their lowest under-achievement 
value to manufacturing statements, and the 
sales executives gave a plus value or an over- 
achievement ranking on sales statements. On 
the other hand, however, the finance executives 
ranked their own departmental statements 
fifth or last. This attitude on their part may 
have been indicative of a somewhat cautious 
or pessimistic attitude on the part of those men 
who were guardians of the company purse, or it 
also may have been a direct reflection of the 
particularly acute financial situation in the 
company at that particular time. 

Again in this phase of the study, there was 
evidence which suggested that the rankings 
given to goal and achievement statements were 
related to (1) the department to which the 
executive was assigned and (2) the type of 
work which the executive did. 


Individual Goal and Achievement Scores 


The individual goal and achievement scores and the three scores for each executive based on the relationship between the goal and achievement scores are included in Table 4.³

For the G/A calculations a score of 1.00 indicated no difference between goal and achievement scores. In other words, the individual with a 1.00 G/A score ranked the 17 statements for degree of achievement of the company equal to the goal importance of the statements. Therefore, it follows that a G/A score of less than 1.00 indicated that the degree of achievement of the company had not equalled the goal importance of the statements, and a score of more than 1.00, that achievement had
surpassed goal importance. 30 

The range of the G/A scores was from A 
to 1.00. With the exception of the one case who ranked goal importance and achievement equal, all of the executives ranked achievement below goal importance.

The G minus A scores represented the algebraic difference of the goal score and the achievement score for the individual. All of the G minus A scores were minus, ranging from 0 to -28. This indicated that the degree of achievement had not reached goal importance, with the exception of the one score of 0 which indicated equality between achievement and goal importance. In this study it was somewhat simpler to work with the G minus A score since all of these scores were minus or zero. If there were plus scores included, it would then simplify the study to work with the G/A score.

³ To reduce printing costs, Table 4 has been deposited with the American Documentation Institute and may be ordered as Document 2741 from American Documentation Institute, 1719 N Street, N.W., Washington 6, D. C., remitting $0.50 for microfilm (images 1 inch high on standard 35 mm. motion picture film) or [amount illegible] for photocopies (6 X 8 inches) readable without optical aid.

Both the G/A and G minus A scores can be used to measure the ratings of individuals toward company goals and their relationships to achievements of the company. However, to measure communication and knowledge of company goals and achievements as expressed by top management, some other score must be used. For this purpose, the algebraic difference between the individual executive's G minus A score and the G minus A score of the president and general manager was taken. All of these scores were plus values with the exception of three cases. This indicated that, with the three exceptions, all of the executives ranked the company’s achievements in relation to goal importance greater than the president and general manager did. As a measure of communication, however, it was immaterial whether the individual’s disagreement with the president and general manager was a plus or a minus score. The sign merely indicated the trend of the difference in terms of goal importance and achievement relationship, but a minus score had the same communication meaning as a plus score.
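The three individual scores described above reduce to simple arithmetic, sketched below in Python. The function names and all numerical scores are hypothetical (Table 4 is not reproduced in the article); the only facts taken from the text are the definitions of G/A, G minus A, and the deviation from the president and general manager, together with the 17 to 85 range implied by 17 statements scored one to five points each.

```python
MIN_SCORE, MAX_SCORE = 17, 85  # 17 statements, 1 to 5 points each


def check(score):
    """Guard against totals that could not arise from 17 ratings of 1 to 5."""
    if not MIN_SCORE <= score <= MAX_SCORE:
        raise ValueError(f"score {score} is outside {MIN_SCORE}-{MAX_SCORE}")
    return score


def ga_ratio(goal, ach):
    """G/A: 1.00 means achievement was rated equal to goal importance."""
    return check(goal) / check(ach)


def g_minus_a(goal, ach):
    """G minus A: zero means equality, minus means achievement fell short."""
    return check(goal) - check(ach)


def deviation_from_president(executive, president):
    """Executive's G minus A less the president's G minus A (sign shows the trend)."""
    return g_minus_a(*executive) - g_minus_a(*president)


# Hypothetical (goal score, achievement score) pairs.
president = (30, 58)
executive = (34, 45)

dev = deviation_from_president(executive, president)
print(round(ga_ratio(*executive), 2))   # below 1.00: achievement short of goal
print(g_minus_a(*executive))            # minus value, same reading
print(dev)                              # plus: rated achievement higher than the president did
print(abs(dev))                         # communication reading ignores the sign
```

Taking the absolute value in the last line reflects the remark that a minus deviation carries the same communication meaning as a plus one.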

As another phase of the study, the G minus A scores and the G minus A deviations from the president and general manager were considered, arranged first by departmental groups of executives and then by echelon groups. The analyses of these results were characterized by two factors: (1) the executive contacts which the individual had within the business, and (2) the extent to which the work performed by the executive was related to the major over-all functions of the business. While this general pattern could not be supported and justified with each individual executive, the tendency of the pattern was strong enough to justify the recommendation that it be pursued further in other research.


Summary

The results of an exploratory study with a method for studying executive leadership called the Goal and Achievement Index have been presented, the data being based on a group of 23 executives in a tire and rubber manufacturing company. The data were analyzed for the total group of executives, for the executives grouped into the major departments of the business, and for individual executives.

On the basis of this exploratory study, it 
appears that the approach used in the Goal and 
Achievement Index offers promising possibi- 
lities for the study of executive function and 
leadership in business as it is related to (1) 
knowledge of company goals and achieve- 
ments; (2) the relationship between goals and 
achievements; and (3) the effectiveness of the 
communication of this knowledge and relation- 
ship from top management through the super- 
visory levels of the business. 

In the analysis of goal and achievement 
scores, two fundamental variables need to be 
particularly considered. The first has to do 
with the goal and achievement statements and 
the individual's evaluation of them at any given 
time. This variable is a constantly changing 
one, subject to variation in accordance with
the conditions existing within a particular 
business at a particular moment. 

The second variable is the relationship ex- 
isting between the scores and ratings, inter- 
individually and inter-departmentally. It can 
be postulated that this relationship will remain 
relatively constant, unless there are changes 
in the personnel studied or unless the company 
attempts some type of a training program 
which will bring the executives, individually 
and departmentally, into greater agreement 
on and provide better understanding of company goals and achievements. Therefore, this
factor is not subject to the same variability as 
the evaluation of individual goals and achieve- 
ments since it is dependent upon the executives 
as individuals and not upon the constantly 
changing factors in company operations. 


Received August 8, 1949.


References 


1. Browne, C. G. An exploration into the use of cer- 
tain methods for the study of executive function 
in business. Unpublished Ph.D. dissertation, 
The Ohio State University, 1948. 

2. Browne, C. G. Study of executive leadership in business. I. The R, A, and D Scales. J. appl. Psychol., 1949, 33, 521-526.

3. Browne, C. G. Study of executive leadership in business. II. Social group patterns. J. appl. Psychol., 1950, 34, 12-15.


To What Extent Have the American People Accepted Socialism? 


Henry C. Link and Albert D. Freiberg 
The Psychological Corporation, New York City 


In 1928, the national convention of the 
Socialist Party adopted a platform including, 
among others, the following domestic planks: 

“1. Public ownership of all national resources and public utilities, including oil wells,
coal mines, power plants, railroads, and tele- 
phone and telegraph facilities. 

2. Governmental relief of unemployment by 
extension of all public works, Federal loans 
without interest to states and municipalities to 
promote public works, old age pensions, un- 
employment, health, and accident insurance, a 
shorter work day. 

3. Acquisition of grain elevators, stock yards, and other distributing agencies by the government or by bona fide cooperating societies; cooperative purchasing, marketing, and credit agencies; government insurance against crop losses; flood control.”¹

The intervening years have seen the enact- 
ment of some of the Socialist goals and the 
present administration has recommended legis- 
lation which would, in part, achieve others. 
The attitudes of present day citizens toward 
Socialism, their definition of socialistic aims, 
and their opinions of existing or proposed 
activities which might be regarded as imple- 
mentation of socialistic philosophy constitute 
an interesting area for opinion research. 

In a previous article² we have reported the
finding that 67% of an urban sample regarded 
Communism as a “dangerous thing in the 
United States" while 26% of the same group 
regarded Socialism as dangerous. Sixty-one 
percent of the respondents felt that Commu- 
nism and Socialism were different. 

In October, 1949, a random subsample of 
1000 adults out of the 5000 interviewed in con- 
nection with the Psychological Barometer were 
asked a series of questions designed to yield a 
more detailed account of opinion on this topic. 

1 Socialist Party (U.S.) Socialist Campaign, 1928. 
As reported in the WORLD ALMANAC of 1929, the 
Republican National Convention of 1928, specifically 
endorsed none of these demands except for a mention 
of cooperative farm marketing agencies. The Demo- 
cratic Convention favored government support (but not 
subsidy) of cooperative farm marketing agencies, public 
works to relieve unemployment, and flood control. 

² Link, H. C., and Freiberg, A. D. The Psychological Barometer on Communism, Americanism and Socialism. J. appl. Psychol., 1949, 33, 6-14.


Q. Are you for or against Socialism in this country? 


Answers                    %
Against Socialism        75.3
For Socialism             6.3
Don't Know               18.4


By the various socio-economic groups, these 
answers were as follows: 


                                   Upper     Lower
Answers                 Upper      Middle    Middle    Lower
                          %          %         %         %
Against Socialism        90         80        74        64
For Socialism       [illegible]      6         7         8
Don’t Know          [illegible]     14        19        28


This was followed by: 


Q. What does Socialism mean to you? 
The 75% who were against Socialism and the 6% for Socialism described it as follows:

                                               %         %
Answers                                     Against     For
Government ownership or control of in- 
dustry, utilities, natural resources, 
health and welfare, medicine and 
doctors 26 30 
Too much Government control; dicta- 
torship; regimentation 18 
Distribution of wealth; fellow who pro- 
duces gets no more than the one who 
doesn’t 12 a 
Destruction of capitalism; restricting or 
doing away with private enterprise 
and competition; disruption of our 9 
economic system; ruin of our country 8 y 
Socialism is related to Communism, leads 
to Communism 9 
Equality for all; no racial discrimina- E 
tion; a hopeless crusade for equality 5 a 
Socialism is like the English government 
of today 3 
Workmen’s compensation; social secu- 
rity; the “Welfare State” 1 
Collectivism; collective Government; 
planned economy 1 6 
Miscellaneous answers 9 0 
Don’t know 15 h 
5 
Total Answers 809 T 


Total Interviews (Base for per cents) 753 




Evidently the majority who say they are 
against Socialism tend to identify Socialism 
with increasing Government control, operation
and ownership. 


Specific Issues 


In order to bring the general issue of Socialism down to specific cases, the following series
of questions was then asked of everyone in our 
1000 interview sample. 


Q. Which of the following do you think are steps 
toward Socialism, and which are not? 

a. The minimum wage law where the Government sets a minimum wage below which no (interstate) business can pay?

b. The T.V.A. and other Government-owned flood control and electric power projects?

c. Payroll taxes or deductions from a person’s pay to provide old age, unemployment and other benefits?

d. Food subsidies where the Government buys eggs, potatoes, and other farm products to keep their prices near parity?

e. Rent control where the Government decides how much rent can be asked for houses and apartments?

f. Government housing where the Government builds apartment houses which are rented to low income people at rents below actual cost?

g. Government rules which require that only union labor can be used on Government building?

h. Peace-time price control where the Government decides how much a business can ask for its products?


The responses of those 753 persons who previously had reported themselves as opposed to Socialism are summarized below.

                                                              Yes,          Favor
                                                              It Is         This
Items                                                         Socialistic   Measure
                                                                %             %
a. Payroll taxes for old age and other benefits, etc.          27            81
b. T.V.A. and similar projects, etc.                           37            66
c. The minimum wage law, etc.                                  38            62
d. Government housing, etc.                                    46            59
e. Rent control, etc.                                          50            43
f. Government rules requiring union labor on Government
   building, etc.                                              54            19
g. Food subsidies for farm products, etc.                      55            24
h. Peace-time price control, etc.                              67            18


Conclusions


Although 75% of a sample of urban adults represent themselves as opposed to Socialism, within this 75% each of eight types of government activity which might logically be regarded as a specific instance of socialistic practice finds a considerable number who favor it and a number who do not regard the practice as “a step toward Socialism.”

Four of these areas, payroll taxes for old 
age benefits, T.V.A. and flood control, minimum wage laws, and government housing find
favor with a majority of those who oppose 
Socialism in general. Furthermore, in the case 
of each of these categories fewer than half of 
those against Socialism regard the measure as 
socialistic. 

The more frequently a specific type of ac- 
tivity is regarded as Socialistic, the less fre- 
quently is it viewed with favor. 
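The inverse relationship stated above, and the four measures singled out in the preceding paragraph, can be checked against the eight pairs of percentages in the summary table. The Python sketch below is an illustration only: the authors report no coefficient, so using a rank-order correlation here is an assumption, and the short item labels are abbreviations introduced for the example.

```python
# (per cent calling the measure socialistic, per cent favoring it) among the 753 opponents
ITEMS = {
    "Payroll taxes": (27, 81),
    "T.V.A.": (37, 66),
    "Minimum wage law": (38, 62),
    "Government housing": (46, 59),
    "Rent control": (50, 43),
    "Union labor rules": (54, 19),
    "Food subsidies": (55, 24),
    "Peace-time price control": (67, 18),
}


def ranks(values):
    """Rank from 1 (smallest) upward; adequate here because no values are tied."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Rank-difference formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


socialistic = [pair[0] for pair in ITEMS.values()]
favor = [pair[1] for pair in ITEMS.values()]
rho = spearman_rho(socialistic, favor)
print(f"rho = {rho:.2f}")

# Measures favored by a majority yet called socialistic by fewer than half.
favored_not_socialistic = [
    name for name, (soc, fav) in ITEMS.items() if fav > 50 and soc < 50
]
print(favored_not_socialistic)
```

The coefficient comes out strongly negative, and the filter returns the same four areas named in the conclusions.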

Some of the planks of the 1928 Socialist 
platform have been translated into practice. 
Several of these are not now regarded as "steps 
toward Socialism" and are favored by a ma- 
jority of those adults who say they are against 
Socialism in the United States. 

Even though Socialism may be a scare-word 
to 75% of the people, it does not follow that 
this fear will be translated into the fear of 
specific measures such as Government owner- 
ship of public utilities, government housing, 
“socialized” medicine and the like. 


Received January 18, 1950. 
Early publication. 


The Selection of Patrolmen 


Philip H. DuBois and Robert I. Watson 


Washington University 


Relatively little is known concerning selec- 
tion of police officers by psychological methods. 
In a recent exhaustive review of the validity 
of commonly employed occupational tests, 
Ghiselli¹ reports briefly on previous work.
Without either specifically identifying the 
particular tests utilized or the studies from 
which they came, he presents certain correla- 
tions with criteria of job proficiency. The 
median coefficients of validity with various 
kinds of tests were as follows: intelligence .28, 
immediate memory .40, arithmetic —.12, 
number comparison .25, name comparison .35, 
finger dexterity .20, and personality question- 
naires .24. With the exception of intelligence 
in which eight coefficients were involved, most 
of these validities were based upon one or two 
correlations. The total combined N in all the 
studies numbered less than 500. The magni- 
tude of the correlations, even in the absence 
of information about their reliability, shows 
that this field of personnel selection has 
certainly not yet been satisfactorily explored. 
With Ghiselli it is agreed that “about all that 
can be said with any degree of certainty is that 
intelligence tests give fair results and that 
personality measures would appear to be 
promising."²

The present study on the prediction of suc- 
cess of patrolmen in training and on the job 
was undertaken at the request of the St. Louis 
Board of Police Commissioners. In late 1947 
an extensive reorganization of the Police De- 
partment was in progress. New equipment 
was secured, new office procedures were insti- 
tuted, the crime detection laboratory was re- 
organized and expanded, a Police Academy for 
recruit and in-service training was organized, 
and plans were made for the introduction of 
psychological methods for the selection and 
evaluation of personnel. As a first step in the
application of psychological methods to personnel
problems, the authors were invited to
participate in the selection of classes of
probationary patrolmen.


¹ Ghiselli, E. E. The validity of commonly employed
occupational tests. Univ. Calif. Publ. Psychol.,
1949, 5, No. 9, 253-288.

² Op. cit., p. 269.


The Sample 


Table 1 presents information on the size and 
composition of the sample at various steps in
the procedures. It will be noted that 1255
applications for appointment were processed.
It was found, however, that only 512 of the
men met the various requirements set up by
the Board, as described in Table 1, as to
education, age, height, weight, residence,
citizenship, and satisfactory character. As 
many of the applications had been on file for 
some months, these 512 men were canvassed 
as to their continued interest in being appointed 
to the Force. The 312 still interested were 
invited to take the selection tests, with 253 
actually appearing for the examination. 

Table 1
Derivation of the Sample Studied

                                                        N
Applications for appointment                         1255
Applicants meeting established standards              512
  (Height 5'9" to 6'4"; weight 150 to 210 lbs.;
  age 22 to 31 years with year-for-year waiver
  for military service; residence 2 years in city;
  citizenship; education 10th grade or equivalent;
  good character as shown by police investigation)
Applicants requesting examination                     312
Applicants appearing for examination                  253
Examinees meeting cutting scores                      180
  (AGCT score of 100 or more, CWF-2 of 9 or less)
Passed physical examination                           168
Passed by Screening Board                             139
Entered Classes 1 and 2                               139
Sample used in present study                          129


Of these, 180 were found to meet the selection
criteria set up for the psychological test
battery shortly to be described. Twelve men
were eliminated by failure to pass the physical
examination, leaving 168 who were interviewed
by a Departmental Screening Board. The 
Screening Board consisting of five members of 
the Department, including the Chief of Police, 
rated each man on five traits: appearance,
manner, speech, adaptability, and general
impression, giving scores of 1 for the lowest and
5 for the highest in each of the characteristics.
A score of 125 was perfect and a score
of 75 or above was considered acceptable.
Twenty-nine candidates were eliminated on the
basis of the screening board rating. The
remaining 139 were appointed as probationary
patrolmen, entering the Police Academy in two
successive eight-week classes, which graduated
in June and August, 1948. The present study
is concerned with 129 of these 139 cases. Of
the ten cases not being reported, two resigned 
while in good standing at the Academy, five 
failed in the curriculum, while in the three 
other instances data were incomplete for one 
reason or another. The number of failures in
training was not large enough for consideration
in this paper, although the problem is scheduled
for study at a later date.
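
The attrition described above can be tabulated as a simple selection funnel. The following is only an illustrative sketch: the counts are those given in the text and in Table 1, while the step labels and the function name `retention` are hypothetical conveniences, not part of the original study.

```python
# Counts at each step of selection, taken from the text and Table 1.
# The labels are paraphrases; the function name is a hypothetical helper.
STEPS = [
    ("Applications processed", 1255),
    ("Met established standards", 512),
    ("Still interested, invited to testing", 312),
    ("Appeared for examination", 253),
    ("Met psychological cutting scores", 180),
    ("Passed physical examination", 168),
    ("Passed by Screening Board", 139),
    ("Sample used in present study", 129),
]


def retention(steps):
    """Return rows of (label, n, share of previous step, share of first step)."""
    first = steps[0][1]
    previous = first
    rows = []
    for label, n in steps:
        rows.append((label, n, n / previous, n / first))
        previous = n
    return rows


if __name__ == "__main__":
    for label, n, of_previous, of_first in retention(STEPS):
        print(f"{label:38s} {n:5d} {of_previous:7.1%} {of_first:7.1%}")
```

Read this way, the largest single loss occurs at the first step, where fewer than half of the applications met the Board's standards.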


Tests Administered 


The battery of psychological selection and
experimental tests used in this study is listed
in Table 2. Before turning to the results,
each instrument will be described.

The St. Louis Police Aptitude Test was
developed by the writers in consultation with
the departmental staff. All material was
prepared in the police context, so that it had
high face validity. Realism was obtained by
using police photographs and excerpts from
police manuals. The items were screened by
experienced police officers who decided whether
or not they were plausible in a police setting.

In its final form the test consisted of 90
multiple choice items divided into five sections.
The first section involved 15 memory items.
They were of two sorts: memory for photographs
and names, and memory of a scene of a
crime and of a scene of an accident. A touch
of verisimilitude was added by using the
photographs of recent criminal suspects. The
names were selected at random. The accident
and crime scenes were posed and involved in
each instance a variety of clues. Typical of
the questions asked was one involving the
locus of the crime. The second section of 15
items required the applicant to select the one
misspelled word from a list of five or to select
the one correctly spelled from a list of words
otherwise incorrectly spelled. Typical words
included alias, waiver, accomplice and felony.
Three reading selections made up the third 
section. These were reports of a fire, a rob- 
bery, and a selection from the police manual 
concerning rules in regard to the making of 
arrests. Each was followed by five multiple 
choice reading comprehension items. The 
next section consisted of 30 items concerned 
with general information and judgment related 
to police work. Police practice, causes of 
crime, the purpose of teaching first aid to 
policemen and the reason automobiles need to 
be inspected regularly were typical of the bases 
for these questions. Care was taken that 
specialized knowledge generally known only 
to a person with police experience was not 
required. The last section involved 15 arith- 
metical problems couched in a police depart- 
ment setting. The test as a whole required 90 
minutes, including directions. Originally it 
was proposed to use a cutting score with this 
test, but since the correlation with the AGCT 
was higher than anticipated it was used only 
experimentally with the first two classes. 

The Army General Classification Test, First 
Civilian Edition, was actually used for selec- 
tion. An AGCT score of 100, corresponding to 
a percentile rank of 48, was the minimum 
acceptable. The two classes had a mean 
AGCT score of 118 or a percentile rank of 80 
in the general population. This score stands 
at the 75th percentile of the distribution re- 
ported for policemen. 

Candidates were asked to write a short com- 
position entitled “Why I Wish to Become a 
Policeman.” This essay was graded for hand- 
writing, using the Thorndike Handwriting 
Scale, and for coherence in English expression, 
using a five-point scale. No candidates were 
actually eliminated on this test but the lowest 
rating essays of men who were otherwise 
qualified were scanned by the Chief of Police 
for acceptability. The final test included in
the original screening battery was the Cornell
Word Form—2, a controlled association test
which has been used to detect neurotic tendencies.
Scores of ten or more were considered
disqualifying.

In the course of the eight weeks training at 
the Police Academy it was possible to ad- 
minister for experimental purposes a further 
series of tests. These included the Figure 
Matching Test, a speeded perceptual measure 
involving matching the silhouette of an object 
or a meaningless shape with five similar figures, 
one of which is the exact duplicate; the Bennett 
Mechanical Comprehension Test—Form BB; 
the Minnesota Paper Form Board MB; the 


Object-Aperture Test, an instrument requiring 
visualization in three dimensions and yielding a 
speed score and a power score; the Strong 
Vocational Interest Blank, which was com- 
pletely scored but which has been validated for 
only three scales—police interest, interest 
maturity, and occupational level; and, finally, 
the Rosenzweig Picture Frustration Study, 
developed as a measure of the direction of 
aggression and the type of frustration reaction. 

The Strong and the Rosenzweig measures
were introduced to study the validities of
certain interest and personality characteristics
for police work. The other experimental tests
were non-verbal instruments which were considered
to be possibilities for supplementing the
selection tests, success in which depended
largely upon verbal ability.


Table 2

Validities of Psychological Tests and Screening Board Rating for Prediction of Success as Probationary
Patrolman (N = 129). Classes of June, 1948, and August, 1948, of Police
Academy, Department of Police, City of St. Louis

                                        Academy   Academy
                                        Grade     Grade     Achieve-
                                        June      August    ment      Marksman- Service
                                        Class     Class     Test      ship      Rating
                                        (N=72)    (N=57)    (N=129)   (N=139)   (N=129)
Police Aptitude Test—Total              .39**     [?]       [?]       .08       .03
  Aussage (15 items)                    .40**     .20       [?]       .20*      .08
  Spelling (15 items)                   [?]       [?]       [?]       -.10      [?]
  Reading (15 items)                    .18       [?]       [?]       [?]       -.03
  Information and Judgment (30 items)   .32**     [?]       [?]       -.01      .12
  Arithmetic (15 items)                 [?]       .30*      .42**     [?]       -.07
Army Gen. Classif. Test—Total           .54**     .50**     [?]       [?]       [?]
  Verbal Items                          [?]       [?]       [?]       .02       -.03
  Numerical Items                       [?]       [?]       [?]       [?]       .02
  Blocks                                [?]       .30*      [?]       .16       [?]
Handwriting                             -.03      .04       .06       -.06      .10
Composition                             [?]       .30*      .36**     -.06      .08
Cornell Word Form 2                     [?]       -.18      -.22*     -.09      .06
Figure Matching (speeded perceptual)    .25*      .29*      .23**     .19*      .03
Bennett Mech. Comp. BB                  .28*      .29*      .20*      [?]       -.10
Minn. Paper Form Board MB               [?]**     .29*      .30**     .26**     .04
Object-Aperture Test—Speed              .18       .14       .09       .07       .10
Object-Aperture Test—Power              .28*      .29*      .17*      .17*      .12
Strong Voc. Int.—Police Interest        -.09      -.12      -.24**    [?]       -.01
Strong Voc. Int.—Interest Maturity      .14       [?]       -.05      -.01      -.04
Strong Voc. Int.—Occupational Level     .06       .19       [?]       .01       .03
Rosenzweig Picture Frustration
  Obstacle Dominance                    -.03      -.09      [?]       .12       .08
  Ego Defensiveness                     .06       [?]       .01       [?]       .12
  Need Persistence                      -.04      .16       .08       .03       [?]
  Impunitiveness                        .24*      .04       .10       .13       -.05
  Intropunitiveness                     -.16      -.04      [?]       [?]       [?]
  Extrapunitiveness                     [?]       -.02      [?]       [?]       .10
  % Group Conformity Rating             -.01      .02       [?]       [?]       .04
Screening Board Rating                  .01       [?]       -.01      .06       -.03

** Significant at 1% level.
* Significant at 5% level.
[?] Entry not legible.


Validities 

Four criteria of validity are used in this
study: the final grade in the Police Academy,
treated separately for the two classes; an
achievement test based on Perkins’ “Elements
of Police Science,” which was required reading
after completion of the course and on which
the patrolmen took an examination approximately
10 weeks after graduation; marksmanship
during the academic training; and service
rating after 10 weeks on duty. The service
rating was based upon an 11-trait rating scale
with five descriptive steps for each trait. The
traits were work attitude; loyalty, interest and
enthusiasm; judgment; report writing; investigative
ability; alertness; bearing and demeanor;
speech; appearance; contacts with the
public; and usefulness to the service. Ratings
were made by the sergeant who directly supervised
the work of the patrolman and were approved
by superior officers in the chain of command.
Only the validities against the over-all
service rating are reported at this time.

In Table 2 validities are reported for all
predictor variables against these four criteria.
It will be noted that for the criteria of academy
grades and the achievement test, the verbal and
numerical sections of the Police Aptitude Test
and Army General Classification Test have the
highest validities, closely followed by the
Composition Test and the non-verbal experimental
tests. For 72 cases reported in the June Class
there are ten correlations significant at the
1% level and five more at the 5% level. For
the August class of 57 cases, there are seven
correlations significant at the 1% level and
seven more significant at the 5% level.
Against the Achievement Test, where the N is
129, there are 15 correlations significant at the
1% level and four more significant at the 5%
level. Against the criterion of marksmanship
there are only two correlations significant at
the 1% level and four at the 5% level. It will
be noted that except for one of the Rosenzweig
categories, all the correlations which are significant
against the marksmanship criterion involve
non-verbal aptitude measures. None of the
individual correlations with service rating is
statistically significant at the 5% level.
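
How large a correlation must be to reach the 5% or 1% level depends on the size of the class. The sketch below approximates those thresholds for the three N's used here. It is an illustration only: it assumes a two-tailed test and uses the Fisher z approximation, which is not necessarily the table or formula the authors consulted, and the function name `critical_r` is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def critical_r(n, alpha):
    """Approximate two-tailed critical |r| for a sample of size n.

    Uses the Fisher z approximation: z = atanh(r) has a standard error of
    about 1 / sqrt(n - 3). This is an assumption made for illustration,
    not the procedure reported in the paper.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return tanh(z_crit / sqrt(n - 3))


if __name__ == "__main__":
    for n in (72, 57, 129):
        r05 = critical_r(n, 0.05)
        r01 = critical_r(n, 0.01)
        print(f"N = {n:3d}: 5% level |r| >= {r05:.2f}, 1% level |r| >= {r01:.2f}")
```

Under these assumptions a coefficient of about .23 is needed for the 5% level and about .30 for the 1% level in the June class of 72, which is why coefficients of similar size carry different markers in the different columns of Table 2.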

The negative validities of the Police Interest 
Scale of the Strong Vocational Interest Blank 
against all criteria except marksmanship are 
perhaps surprising. However, the academy 
grade and the Achievement Test would prob- 
ably be somewhat artificial criteria of perfor- 
mance for policemen on whom the scale was 
developed. A tabulation of the Police Interest 
Scores of the men in the first class showed 
86% of them A or B+, with no score below 
B—. It is possible that in a greater range of 
interest scores, the test would have been 
predictive. 

The complete matrix of approximately 1000
intercorrelations of all the predictor variables
and the several criteria of success has been
computed. The intercorrelations of 22 selected
predictor variables are presented in Table 3.³
The Police Aptitude Test has a correlation of
.53 with the Army General Classification Test
in this restricted range. The intercorrelations
of the non-verbal experimental tests ranged
from .20 to .55, indicating that there is considerable
overlap among them.

In Table 4, Beta weights, together with 
multiple correlations, are presented for the pre- 
diction of several criteria of success. For 
academic grades and achievement tests, the 
Wherry-Doolittle test selection method was 
employed. For marksmanship and service 
ratings ordinary regression weights were com- 
puted using the variables which appeared to 
be the most promising. 

Table 4

Beta Weights and Multiple R's in Predicting Several Criteria of Success as Probationary
Patrolman, St. Louis Police Department

Columns, in order: Academic Grade Class 1 (N = 72); Academic Grade Class 2 (N = 57);
Achievement Test (N = 129); Marksmanship (N = 129); Service Rating (N = 129).
Blank cells are not shown; entries are listed in printed order.

Police Aptitude Test—Total          [?]  .28  .28
  Aussage                           [?]
  Information and Judgment          .08
Army Gen. Classif. Test—Total       [?]  .28  .25
  AGCT—Blocks                       [?]
Composition                         [?]
Figure Matching                     [?]
Bennett Mech. Comp.—BB              [?]  .17
Minn. Paper Form Board MB           .20  [?]  [?]
Object-Aperture (Power)             .13
Strong—Occupational Level           .23
Rosenzweig
  Ego Defensiveness                 .20
  Intropunitiveness                 [?]
Multiple R                          .5[?]**  .60**  .59**  .33**  .29*

** Significant at 1% level.
* Significant at 5% level.
[?] Entry not legible.


³ To save space, Table 3 has been deposited with the
American Documentation Institute and may be ordered
as Document 2696 from American Documentation Institute,
1719 N Street, N.W., Washington 6, D. C.,
remitting $.50 for microfilm (images 1 inch high on
standard 35 mm. motion picture film) or $.50 for
photocopies (6 X 8 inches) readable without optical aid.


It is immediately apparent that the combinations
of tests useful in prediction differ
considerably from criterion to criterion. The
Police Aptitude Test, the Army General Classification
Test and at least one of the non-verbal
tests (Minnesota Paper Form Board or
Bennett Mechanical Comprehension) appear
in the regression equations predicting the
“academic” criteria, i.e., academic grade in
Class 1 and Class 2, and the achievement test.
The composition test carries a weight in Class
2 and the Strong-Occupational Level carries a
weight for the achievement test.

Marksmanship can be predicted to some degree
from tests of a non-verbal sort, involving
perception, space relations, and mechanical
comprehension.

Service rating, which is predicted least
accurately, depends to a considerable degree
upon personality predictors, with some weight
being carried by the tests of spatial relations
and information and judgment. For service
rating the multiple R of .29 is significant at the
5% level of confidence, while all other multiple
R's are significant at the 1% level. It may be
that tests that will predict the service rating
more adequately will have to be devised.

A third class, numbering 36 cases and
selected in part by means of a composite score
based on the Police Aptitude Test, the AGCT,
the Composition Test and the Cornell Word
Form, recently completed a 12-weeks training
course. For this group the correlation between
a weighted score based upon the regression
weights obtained with Class 2 and
academic grade was .62, or slightly better than
anticipated. Of the 12 men who were in the
top third according to this composite score,
83% had final grades above the median for the
class, compared with 41% in the middle third
and only 25% in the lower third. Since in
Class 3 the validity of the Police Aptitude Test
was .51 and of the AGCT .55, it would appear
that the battery as a whole is somewhat more
predictive in police training than a mental
alertness test alone would be.
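
The gain from combining tests can be illustrated with the standard two-predictor formulas for beta weights and multiple R. The sketch below is not the Wherry-Doolittle procedure used in the study, and it rests on a loudly stated assumption: the Class 3 validities (.51 and .55) are combined with the intercorrelation of .53 that was reported for Classes 1 and 2, since no Class 3 intercorrelation is given. The function name is hypothetical.

```python
from math import sqrt


def two_predictor_regression(r1y, r2y, r12):
    """Beta weights and multiple R for two standardized predictors.

    r1y, r2y: validity of each predictor against the criterion.
    r12: correlation between the two predictors.
    """
    denom = 1 - r12 ** 2
    beta1 = (r1y - r2y * r12) / denom
    beta2 = (r2y - r1y * r12) / denom
    multiple_r = sqrt(beta1 * r1y + beta2 * r2y)
    return beta1, beta2, multiple_r


if __name__ == "__main__":
    # Assumption: r12 = .53 is borrowed from Classes 1 and 2 (see lead-in).
    beta_pat, beta_agct, big_r = two_predictor_regression(
        r1y=0.51,  # Police Aptitude Test vs. academic grade, Class 3
        r2y=0.55,  # AGCT vs. academic grade, Class 3
        r12=0.53,  # Police Aptitude Test vs. AGCT
    )
    print(f"beta (Police Aptitude Test) = {beta_pat:.2f}")
    print(f"beta (AGCT)                 = {beta_agct:.2f}")
    print(f"multiple R                  = {big_r:.2f}")
```

Under these assumptions the two tests together give a multiple R near .61, somewhat above either test alone and close to the .62 reported for the weighted composite.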


Plans for Further Research 


Up to the present time only the first edition
of the Police Aptitude Test has been used.
Prior to its administration in connection with
Classes 1 and 2, none of the material had been
tried out in any way. An item analysis of this
test has been completed which is being used as
a guide in preparing the next edition. Since a
number of the items have been found to be
unsuitable, it should be possible to improve the
validity of the instrument.

Plans for further research include validating
the original battery against new criteria as
they become available, such as service ratings
after a year and two years of service, remaining
on the force over a period of years and special
reports of meritorious service. In the first
three classes, the experimental tests adminis- 
tered were largely of the non-verbal sort, and 
from that study we can add two or three tests 
to the selection battery. Other types of tests 
are scheduled for experimental try-out from 
time to time, so that the selection battery can 
be modified as further validity information 
becomes available. 


Summary 


1. A battery of psychological selection and 
experimental tests was administered to two 
entering classes (N=72 and N=57, respec- 
tively) of probationary patrolmen at the St. 
Louis Police Academy. A third class (N=36)
was used to test the predictive efficacy of the
battery. Criterion variables were final grade
in the Police Academy, an achievement test on
knowledge of police practices, marksmanship
and ratings by superior officers.


2. A specially constructed Police Aptitude 
Test and the Army General Classification Test 
were consistently good predictors of academic 
performance and of the achievement test score. 
At least one non-verbal test, such as the 
Bennett Mechanical Comprehension Test, or 
the Minnesota Paper Form Board, is needed 
as a supplement in order to achieve a multiple
coefficient of validity in the neighborhood
of .60.

3. The best predictors of marksmanship are
non-verbal aptitude tests, a combination of
three of which yields a multiple coefficient of
validity of .33.

4. None of the tests explored has a signifi- 
cant correlation with rating on the job by 
superior officers. A combination of three non- 
verbal tests and two variables of the Rosen- 
zweig Picture Frustration Study yields a 
multiple coefficient of validity of .29 which is 
significant at the 5% level of confidence. 

5. In the third class (N=36) a predictive
score, based upon the regression weights devel- 
oped with Class 2, had a correlation of .62 
with final grade in the Police Academy. 


Received August 3, 1949. 


Group Differences in Performance on the Meier Art Test 


E. Terry Prothro 


University of Tennessee 


and 


Harold T. Perry


Louisiana State University


The Meier Art Judgment Test attempts to 
measure an ability which is subject to con- 
siderable development through learning and 
experience. In this paper we have compared 
the scores of several groups of subjects in an 
effort to determine whether membership in 
those groups is associated with differences in 
test performance. If it is, then such infor- 
mation would be of importance to those 
counselors who utilize the test and to those 
students of aesthetics who wish to understand 
the nature and genesis of art judgment. 


Subjects 


A total of 410 subjects were used in this 
study. Of these, 109 were from a state uni- 
versity for whites, 100 were from a state 
university for Negroes, 100 were from the 
tenth and eleventh grades of a city high school 
for whites, and 101 were from the tenth and 
eleventh grades of a city high school for 
Negroes. All four schools are located in the 
same city in the state of Louisiana. Of the 
total number of subjects, 223 were males and 
187 were females. 


Procedure 


Selection of subjects was by school classes. 
Permission was obtained from the school 
authorities to test all of the members of a 
given class. Neither authorities nor students 
were given advance information on the exact 
nature of the study. The revised Meier Art 
Judgment Test booklets and forms were placed 
on the desks of all students in the class. 
Each class consisted of approximately 30 
students. 

Because the test is self-administering, few 
verbal instructions were given. It was ex- 
plained that the test was not a part of the 


school curriculum, and that performance on it
would not affect school standing in any way.
Only two persons objected to taking the test;
both were excused. When the subjects filled
out the personal data sheets, they were specifically
instructed to indicate the number of
craftsmen in their ancestry by filling in the
blanks at the bottom of that sheet. As far as
could be determined by the experimenters,
test conditions were about the same for all
subjects.


Results 


Examination of the personal data sheets
revealed that a majority of the subjects had
received some art training in elementary
school, some had studied art in high school,
but very few had had any training beyond that.

The results are summarized in Table 1. It
can be seen that there are significant differences
between mean scores of every pair of groups
except the male-female groups. The similarity
of mean scores of the male students and the
female students was present in each of the
four schools. In the two Negro schools, the
males did slightly better; in the two white
schools, the females did slightly better. None
of these differences was significant at the five
percent level. The similarity of the sex groups
on performance on the Meier Test is especially
notable in view of the fact that Carroll and
Eurich¹ found that women did better than men
on both the McAdory and the Meier-Seashore
tests.
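
The comparisons in Table 1 rest on the ratio of a difference between two means to the standard error of that difference. The sketch below works through the one row of the table that is fully legible, the craftsmen-in-ancestry comparison. It assumes the common large-sample formula for the standard error of a difference between independent means; the authors do not state which formula they used, and the function name is hypothetical.

```python
from math import sqrt


def mean_difference_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Return (difference, standard error of the difference, t-ratio).

    Assumes two independent groups and the large-sample formula
    SE = sqrt(sd1^2 / n1 + sd2^2 / n2).
    """
    diff = mean1 - mean2
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff, se, diff / se


if __name__ == "__main__":
    # Students with craftsmen in ancestry vs. students with none (Table 1).
    diff, se, ratio = mean_difference_ratio(87.2, 13.9, 135, 84.0, 13.4, 275)
    print(f"difference = {diff:.1f}")
    print(f"S.E. of difference = {se:.2f}")
    print(f"t = {ratio:.2f}")
```

The result comes out close to, though not identical with, the printed values of 1.45 and 2.23; the small discrepancy presumably reflects rounding of the published means and standard deviations or a slightly different formula.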


The superiority of university students over
high school students and of white students over
Negro students may be attributable to differences
in the socio-economic status of the
groups. Rodgers² found that better environments
surrounded children selected as more
competent artistically. Meier³ has pointed
out that some aspects of artistic ability are
subject to considerable development through
experience, and we believe that the differences
in daily experiences of members of different
socio-economic groups may be the basis for
the differences in their scores.


¹ Carroll, H. A., and Eurich, A. C. Group differences
in art judgment. Sch. & Soc., 1931, 34, 204-


Table 1

Comparison of Mean Scores on the Meier Art Test

                                                          S.E. of
Group                             N    Mean    S.D.  Diff.  Diff.      t
White University                109    96.6     9.7
                                                      [?]    1.38    [?]
White High School               100    89.7    10.4
                                                      [?]    [?]     [?]
Negro University                100    78.3    11.2
                                                      3.6    1.54    [?]
Negro High School               101    74.7    10.5

University Students             209    87.8    13.9
                                                      [?]    [?]     [?]
High School Students            201    82.2    12.8

White Students                  209    93.3    10.5
                                                      [?]    [?]     [?]
Negro Students                  201    76.5    11.0

Male Students                   223    85.6    13.3
                                                      [?]    1.36    [?]
Female Students                 187    84.3    14.1

Students with Craftsmen
  in Ancestry                   135    87.2    13.9
                                                      3.2    1.45    2.23*
Students with No Craftsmen
  in Ancestry                   275    84.0    13.4

* t-score is significant at the five per cent level.
** t-score is significant at the one per cent level.
[?] Entry not legible.

Our results confirm Meier’s thesis that there
is some relationship between “craftsman
heredity” and aesthetic judgment. The differences
between persons with craftsmen in their
ancestry and those with no known craftsmen
in their ancestry are fairly small, but they are
significant at the five per cent level. It is
possible that the ancestry factor is less significant
for unselected school populations than for
populations of art students. Whether the
relationship is a function of selective recall
about ancestry, of differences in home environment,
or of some “innate neuro-physical constitution”
cannot be determined from the information
now available.


² Rodgers, F. Variation in the aesthetic environment
of artistic and non-artistic children. Psychol.
Monogr., 1933, 45, No. 1, 95-107.

³ Meier, N. C. Factors in artistic aptitude: final
summary of a ten-year study of a special ability.
Psychol. Monogr., 1939, 51, No. 5, 140-15


Summary 


The revised Meier Art Judgment Test was 
given to 410 high school and college students 
in Louisiana. Comparison of mean scores re- 
vealed that the white students did significantly 
better than the Negro students and that college 
students did significantly better than high 
school students. It was suggested that these 
differences may be associated with differences 
in socio-economic status of the several groups. 
Those persons who had craftsmen in their 
ancestry did somewhat better than those who 
did not have craftsmen in their ancestry. 
There were no significant sex differences.


Received June 10, 1949. 


Use of an Objectivity Key on a Short Industrial 
Personality Questionnaire 


Harold F. Rothe 


Stevenson, Jordan and Harrison, Inc., Chicago, Illinois 


Industrial and personnel psychologists have 
long been concerned with the problem of pos- 
sible “faking” by respondents to personality 
questionnaires. Bonnardel (1), in a discussion 
of industrial psychology in France, wrote that 
such questionnaires are little used because it is 
believed that candidates for jobs give the re- 
sponses most favorable to their obtaining the 
jobs. Giese and Christy, in a study reported 
by Tiffin (4), found that the Humm-Wads- 
worth Temperament Scale could be “faked” by 
college students to show higher normal (N) 
scores and less of undesirable characteristics. 
Longstaff (3) has recently shown that the 
Strong Vocational Interest Blank and the 
Kuder Preference Record could both be faked. 
He suggests that a key like the L or K keys of 
the Minnesota Multiphasic Personality In- 
ventory be incorporated in industrial question- 
naires, not because faking is known to exist, 
but because it might exist. 

The purpose of the present paper is to report 
some experience in the use of a key similar to 
the MMPI L key on a short personality ques- 
tionnaire that has been used in industrial 
practice for the past three years. This form 
has been developed for the use of the members 
of the psychological staff of a management con- 
sulting company and is not available for general 
distribution. Despite this restriction of the 
particular form, the experiences gained with it 
are believed to be helpful to other psychol- 
ogists, and hence this paper is presented. 


Description of the Form 


The form used here consists of fifty items to 
be answered by a YES or NO. It is self- 
administering, has no time limit, and can be 
used for individuals or groups. All items are 
stated in a positive manner and most of them 
refer to behavior rather than to feelings, atti- 
tudes, or interests. The items are scored on a 


number of keys that are industrially significant. 
One key is the so-called Objectivity key which 
has been patterned after the L scale of the 
MMPI. The other keys are based upon items 
which have been found valid, to some extent, 
on other published keys, although in each in- 
stance the wording has been slightly changed 
to avoid copyright infringement. Several keys 
have only three or four items, and others run as 
high as ten or twelve. For statistical reasons, 
only the longer keys have been analyzed and 
are mentioned in this paper.

In one very important respect this questionnaire
differs from most standardized personality
questionnaires. This form is primarily
an interview aid. It is used by trained persons
in conjunction with a personal history of each
respondent, various tests, and an interview.
Being used in this manner, the extreme shortness
(50 items) is an advantage rather than a
handicap. The interviewer can look over each
statement and each answer, and can interview
about each item if he so desires. At the same
time the form can be scored to reveal patterns,
and the interviewer can also use this pattern
information as a part of his interview.

The longer keys are Emotionality, Social
Dominance, Drive, Extraversion, and Objectivity.
Since Social Dominance and Extraversion
are highly correlated with each other
(in this unrevised form), only the Social
Dominance key is discussed here. The Emotionality
key has been found to be correlated
with neuroticism and may be considered as a
measure of that. The Drive key measures the
aggressiveness with which the respondent
strives to reach various goals. It is planned to
present validation studies for some of these
keys in subsequent papers and accordingly
that problem is not discussed further here.

Reliability was determined by testing and
retesting a class of 44 college students with a
two-week intervening period. The reliability
coefficients (or better, the consistency coefficients)
were .84 for the Objectivity, Emotionality,
and Social Dominance keys, and .69 for the
Drive key. These compare very favorably
with the test-retest reliabilities of .71 to .83
reported for the MMPI (2, p. 3). The consistencies
reported in this paper are remarkably
high considering the small number of items on
each scale. The high degree of consistency is
undoubtedly partly a function of the well-controlled
conditions under which the test-retest
was made.

The Objectivity key, as stated before, is
closely patterned after six of the items on the
L scale of the MMPI. It would perhaps be
more accurate to refer to these items as L
items so that their source
would be more clearly recognized and credited.
The name was changed for practical purposes.
Since the profile obtained from this form is
frequently discussed with employers or prospective
employers, it may be an injustice to
state that a respondent is “lying” or “faking.”
Even though this may be true, insofar as this
questionnaire is concerned, the fact may well
be that the respondent is basically an honest
person. It might be doing him a great injustice
to let someone draw conclusions about
the extent to which he “lies” when that conclusion
would be based upon six items on a
questionnaire.


Relation of Objectivity to Emotionality 


Since the Emotionality key has been found
to discriminate neurotics from the general
industrial population with a fairly high degree
of accuracy, the primary analysis was made
upon the relation of the Emotionality scores
to the Objectivity scores. The work of Giese
and Christy led to the belief that less objective
persons might show lower Emotionality scores
than highly objective persons. Conversely,
highly objective persons might be expected to
show high Emotionality scores. This expectation
is based upon the assumption that the
Humm-Wadsworth N score is reasonably
valid. It is also based upon the common-sense
observation that if someone is going to
fake a questionnaire he will fake items to make
him look better adjusted or better qualified
for the job, as he interprets these items.


¹ The writer wishes to thank Robert C. McWilliams
of Northwestern University for conducting this experiment.

A sample of 525 persons was selected. The 
selection was made on the basis of their Ob- 
jectivity scores; 144 persons scoring 0 or 1, 237 
persons scoring 2 or 3, and 144 persons scoring 
4, 5, or 6. The sample was selected from 
master data sheets on which are recorded scores 
of all persons taking this and other tests. 
These persons were all industrial workers in 
various job classifications. For the most part 
they were supervisors, or supervisory candi- 
dates, in both shop and office jobs. Beginning 
with the most recent entry, the first 144 persons 
who scored 0 or 1 in Objectivity were located, 
and their Emotionality scores were plotted. 
In a similar manner the other two groups were 
located and plotted. 

The relation between Emotionality and Objectivity
could be determined by correlational
techniques, but this has not been done, because of
the writer’s belief that a table of frequency
distributions would show the same thing
graphically and in a more meaningful manner.
Table 1 contains the distribution of Emotionality
scores for these three groups (High, Low,
and Medium) selected on the basis of their
Objectivity scores.


Table 1


Distributions of Emotionality Scores Made by
525 Persons Obtaining Different
Objectivity Scores


                 Number of Persons with
Emotionality     Objectivity Scores of
Score            0 or 1    2 or 3    4, 5 or 6
   0                5         2          0
   1               17        11          4
   2               25        20          5
   3               29        41         17
   4               24        38         15
   5               26        37         21
   6               10        34         20
   7                2        24         20
   8                4        20         18
   9                2         8          8
  10                          2         10
  11                                     4
  12                                     1
  13                                     1
  14
Mean              3.5       4.8        5.7
S.D.              1.9       2.1        3.4


Table 2


Distributions of Social Dominance Scores Made
by 525 Persons Obtaining Different
Objectivity Scores


Social           Number of Persons with
Dominance        Objectivity Scores of
Score            0 or 1    2 or 3    4, 5 or 6
   0                1         0          0
   1                0         1          0
   2                2         3          1
   3                3        10         13
   4                7        18         11
   5               14        39         26
   6               36        40         37
   7               39        68         29
   8               38        45         22
   9                4        13          5
  10                0         0          0
Mean              6.5       6.3        6.0
S.D.              1.5       1.6        1.6

Table 1 shows that persons with high Ob- 
jectivity scores tend to have high Emotionality 
scores, and vice versa. The critical ratio be- 
tween low and medium is 2.4; between medium 
and high is 1.3; and between low and high is 
3.3. The first two of these differences may be 
said to be of little statistical significance. 

It is perhaps more meaningful, from an in- 
terviewer’s point of view, to look at the dis- 
tributions. A respondent who scores 7 on 
the Emotionality key may be considered as 
probably emotional or neurotic if his Objectivity
score is 0 or 1. But if his Objectivity
score is 5, that person would be practically
average in Emotionality, and the interviewer 
would not expect to find symptoms of neuroti- 
cism. The interviewer could profitably spend 
more time interviewing in other areas with this 
highly objective respondent. 

It is possible that the inverse relationship of 
Objectivity to Emotionality that has been 
found here is independent of "faking," i.e.,
that non-objective persons are actually less 
emotional or less neurotic than objective per- 
sons. Data exist, and it is hoped to publish 
them shortly, to rule out this possibility. 


Harold F. Rothe 


Relation of Objectivity to Social Dominance 
and to Drive 


There is, as shown in Table 1, a relation be- 
tween Objectivity scores and Emotionality 
scores, for this industrial sample. This raises 
the question of the extent to which scores on 
other scales vary with Objectivity scores for 
these persons. Two other long scales of this 
questionnaire have been selected for further 
analysis. 

Table 2 contains the distributions of the 
scores of these same three groups on the Social 
Dominance scale. There is very little difference
among the Low, Medium, and High Objectivity
groups in terms of their Social Dominance.
The critical ratios are: .44 between Low
and Medium; .70 between Medium and High; 
and 1.08 between Low and High. None of 
these differences is significant. 

Table 3 contains the distributions of the
same three groups on the Drive scale. Again
the differences between the groups are insignificant,
with critical ratios of .75 between
Low and Medium; .68 between Medium and
High; and 1.30 between Low and High.

The results presented in these three tables
are based upon an industrial sample composed
of adults, most of whom have had supervisory
experience of one kind or another. The sample
is uncontrolled in terms of intelligence and it is
assumed that there are no differences among
the three groups in that respect. It does not
necessarily follow that similar results would be
obtained from samples of a different population.


Table 3


Distributions of Drive Scores Made by 525 Persons
Obtaining Different Objectivity Scores


                 Number of Persons with
Drive            Objectivity Scores of
Score            0 or 1    2 or 3    4, 5 or 6
   0                0         0        [?]
   1                0         1        [?]
   2                1         1        [?]
   3                6        10        [?]
   4               20        25        [?]
   5               38        53        [?]
   6               35        61        [?]
   7               34        40         30
   8                8        33         23
   9                2        12        [?]
  10                0         1          2
Mean              5.6       6.0        [?]
S.D.              1.3       [?]        1.6

To be more explicit, it is possible that other
groups might show a variation in Social Dominance
or Drive scores, depending upon Objectivity
scores. It is reasonable to assume that
the persons analyzed here, having had supervisory
experience, might truly rank fairly high
in both Social Dominance and Drive and hence
not attempt to “fake” higher scores in those
respects. It is also possible, however, that
these persons could not “fake” either of those
keys, even though they tried to do so. This
possibility seems less likely than does the
previous one.


Summary and Discussion 


Data have been presented to show how the
Emotionality scores of persons on a short
industrial personality questionnaire may vary
with the Objectivity, or L scale, scores of those
persons. Scores on Social Dominance and
Drive scales do not vary significantly with
Objectivity scores. This means that persons
may "fake" a better adjustment than they
actually have; it also means that the interviewer
can adjust for the faking by interpreting
an Emotionality score on norms based
upon persons of similar Objectivity scores.

This paper accordingly illustrates a method
for correcting for faking on short industrial
personality questionnaires. This is essentially
the same technique as that of the L scale in the
MMPI, and the actual items are similar to
MMPI items (2). There is, however, one
interesting difference in terms of interpretation.

One aspect of industrial psychology is the
testing and interviewing of the key personnel
within an organization to determine the extent
to which those persons are adequately qualified
for their positions. It is reasonable to assume
that these persons will “try their best” on the
various tests and questionnaires. Indeed, it
is the interviewer’s responsibility to see that
they do. Under these conditions it is only to
be expected that a small degree of “faking”
or of “stretching the point” will prevail. The
interviewer expects this and includes it as a
part of his frame of reference.

Occasionally, however, an extremely ob- 
jective, or perhaps an extremely naive, person 
is encountered who is frank about himself to 
the point where he penalizes himself on a 
standardized questionnaire. Such a person 
might admit that he loses his temper, for 
example, even though he realizes this state- 
ment may be weighed against him. 

Probably the chief use of an Objectivity key 
such as the one described in this paper is to 
protect such persons. That is, the problem 
is not one of locating “fakers” and possibly 
adjusting their results downward. Rather, 
the problem is one of protecting the “overly 
honest” persons and adjusting their scores 
upwards, so that they may be compared favor- 
ably with their more average fellows. An ex- 
ample of this has previously been given: A high 
Emotionality, or neurotic, score may not be 
serious if the respondent is highly objective; 
the same score may be very extreme if the 
respondent is non-objective. The interviewer 
must include the concept of objectivity in his 
interpretation to obtain a correct description of 
the respondent. One method for doing so is to 
maintain different norms for different objec- 
tivity groups, as has been described here. 
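
The idea of separate norms for each Objectivity group can be sketched in a few lines of Python. The frequencies are those of Table 1; the function name `percentile_rank` and the convention of counting half of the tied scores are assumptions made for illustration and are not taken from the paper. The entry for Emotionality score 2 in the high-Objectivity column is taken as 5, the value consistent with the stated group size of 144.

```python
# Emotionality score frequencies from Table 1, by Objectivity group.
# List index = Emotionality score (0..13); value = number of persons.
TABLE_1 = {
    "low (0-1)":    [5, 17, 25, 29, 24, 26, 10, 2, 4, 2, 0, 0, 0, 0],
    "medium (2-3)": [2, 11, 20, 41, 38, 37, 34, 24, 20, 8, 2, 0, 0, 0],
    # Assumption: the score-2 entry is read as 5 so the column totals 144.
    "high (4-6)":   [0, 4, 5, 17, 15, 21, 20, 20, 18, 8, 10, 4, 1, 1],
}


def percentile_rank(freqs, score):
    """Percent of the group scoring below `score`, counting half of the ties."""
    below = sum(freqs[:score])
    ties = freqs[score] if score < len(freqs) else 0
    return 100.0 * (below + 0.5 * ties) / sum(freqs)


for group, freqs in TABLE_1.items():
    rank = percentile_rank(freqs, 7)
    print(f"{group:13s} n={sum(freqs):3d}  score 7 -> percentile {rank:.0f}")
```

On these figures an Emotionality score of 7 falls at about the 95th percentile of the low-Objectivity group but only at about the 64th percentile of the high-Objectivity group, which is the contrast the interviewer is asked to keep in mind.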


Received August 9, 1949. 


References 


1. Bonnardel, R. Industrial psychology in France.
Pers. Psychol., 1949, 2, 47-68.
2. Hathaway, S. R., and McKinley, J. C. Manual for
the Minnesota Multiphasic Personality Inventory.
New York: The Psychological Corporation, 1943.
3. Longstaff, H. P. Fakability of the Strong Interest
Blank and the Kuder Preference Record. J.
appl. Psychol., 1948, 32, 360-369.
4. Tiffin, J. Industrial psychology. New York: Prentice-Hall,
Inc., 1947, rev. ed., pp. 170-171.


The Application of Weber's Law to Job Evaluation Estimates 


Edward N. Hay 
Edward N. Hay and Associates, Inc., Philadelphia, Pa. 


One method of job evaluation which is gradu- 
ally coming into more general use is the method 
known as factor comparison, devised by 
Eugene J. Benge.¹ Benge’s method is similar
to other methods of job evaluation such as the 
predetermined point plan in that the judgments 
are expressed in points, and both methods 
evaluate factors of the job under study rather 
than the job as a whole. The scales used by 
the two methods are quite different. In pre- 
determined point plans certain qualities are 
identified, on the basis of a study of the jobs, 
as necessary for their performance. Scales 
are then constructed and given point values a 
priori to measure these requirements. Factor 
comparison scales, on the other hand, are con- 
structed by dividing the salaries of existing jobs 
among the factors. Nowadays, however, the 
per cent method² of creating factor comparison
key scales is frequently used. This method 
expresses as percentages, (1) the magnitude of 
each factor of a job in relation to the other 
factors in that job, and (2) the relative magni- 
tude of a factor in one job to the same factor 
in the other key jobs. 

Job evaluation by any method is dependent
on the judgment of the evaluators. In point
methods much, if not most, of the judgment is
in the factors: their selection, definition, pointing,
and weighting. In factor comparison the
judgment is less in these things, and more in
the comparison of factors of jobs being evaluated.
The evaluators compare one job with
another in respect of each factor rather than
match a job against a description on a scale.
The jobs are familiar to all members of the
evaluation committee because the original
scale of jobs is created by the committee itself.
Point methods tend to have steps of irregular
spacing, while factor comparison evaluation
scales have intervals that are equal in “sensed
units of difference."

¹ Described in Industrial Relations Magazine, Feb.,
Mar., and Apr., 1932 (now out of print), under the
title “Gauging the job’s worth.” The mechanics of
the method are also described in Benge, Burke, and
Hay, Manual of job evaluation. Harper and Bros.,
1941, in chap. IV, pp. 41-56, which is a reprint of an
article by E. N. Hay, Arranging the right pay, Personnel
J., 1939, 17, 1-7.

² Edward N. Hay. Creating factor comparison key
scales by the per cent method. J. appl. Psychol.

Not much has been written about the theory
underlying the use of this method. It has been
found, however, that the judgments expressed
in factor comparison job rating seem to behave
in accordance with Weber's Law.³ By way of
illustration: an increase in job difficulty at a
high level calls for a larger salary "raise" than
at a low level. For example, at the lower end
of the scale two jobs of just noticeable difference
in difficulty might have salaries of $100
and $120 a month respectively. At a higher
level two jobs of just noticeable difference in
difficulty might be paid $1000 and $1200 respectively,
but certainly not $1000 and $1020.
So arithmetic intervals in job difficulty are
matched by logarithmic increments in salary.
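
The arithmetic behind this illustration can be made concrete with a short sketch. It assumes, for illustration only, a constant Weber fraction of 0.20 (the ratio implied by the $100/$120 and $1000/$1200 pairs in the example); the function names `salary_at_step` and `steps_between` are hypothetical and do not come from the paper.

```python
import math


def salary_at_step(base_salary, weber_fraction, steps):
    """Salary after `steps` just-noticeable increases in job difficulty.

    Each step multiplies the salary by (1 + weber_fraction), so equal
    (arithmetic) steps in difficulty produce a geometric salary series.
    """
    return base_salary * (1.0 + weber_fraction) ** steps


def steps_between(low_salary, high_salary, weber_fraction):
    """Number of just-noticeable steps separating two salaries."""
    return math.log(high_salary / low_salary) / math.log(1.0 + weber_fraction)


k = 0.20  # assumed fraction: $120 is 20 per cent above $100
print(round(salary_at_step(100, k, 1)))        # one step above $100
print(round(salary_at_step(1000, k, 1)))       # one step above $1000
print(round(steps_between(100, 1000, k), 1))   # steps from $100 to $1000
```

Under this assumption one step above $100 is $120 and one step above $1000 is $1200, and between twelve and thirteen such steps separate $100 from $1000. The count of steps grows with the logarithm of the salary ratio, which is the sense in which equal steps in difficulty correspond to logarithmic increments in salary.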

One of the steps in the factor comparison
method calls for assigning a percentage value
to each of the factors in a job (factors which are
common to all jobs) in such a way that the
total of all the factor values equals 100%.
For example, the job of machine bookkeeper
was selected as a “key job,” and the committee
made the following assignment of percentage
values to the five factors as representing in its
judgment the relative importance of the
factors in that job: Mental Requirements, 18;
Skill Requirements, 27; Physical Requirements,
17; Responsibility, 27; and Working
Conditions, 11.


The factor scales are constructed with steps of
15 per cent increments and uniform differences
in job difficulty. Table 1 shows steps for the
factor of skill and is representative of other
factor scales created by evaluation committees.

Additional jobs that are to be evaluated are
then compared for their skill requirement with
the jobs on the scale shown in Table 1. For
example, the job of Multigraph Operator is
studied for the skill requirement as outlined
in the factor definitions. Examination indicates
that the level of skill required for this job
is most nearly the same as for Telephone
Operator, and that it requires more skill than
Stock Clerk and less than Transcribing Machine
Operator. This three-level comparison
places the skill requirement at the level of
Telephone Operator, which gives it 27 points.
In the same manner the requirement of the
new job of Multigraph Operator is compared
with the jobs on the other factor scales and is
given points accordingly. The sum total of
the point values for all factors is the point value
for the job as a whole. By entering a specially
constructed table of point and salary values
we find the salary grade appropriate for the
new job of Multigraph Operator.

³ See H. E. Garrett. Great experiments in psychology.
New York: Century Co., 1930, pp. 268-274; and Edward
N. Hay. Characteristics of factor comparison
job evaluation. Personnel, 1946, 22, 370-375.


Table 1


Scale for Evaluating Skill Factor


                                       Skill Value
                                       in Points
Job Title                              (15% intervals)
Supervisor of Construction                  93
Credit Analyst                              [?]
Supervisor Trust Accounting                 70
Head Bookkeeper                             [?]
Cost Accounting Clerk                       53
Real Estate Maintenance Inspector           47
Senior Collection Clerk                     [?]
Junior Branch Teller                        [?]
Transcribing Machine Operator               31
Telephone Operator                          27
Stock Clerk                                 [?]
File Clerk                                  [?]
Note Counter                                [?]
Elevator Operator                           [?]
Counter Girl                                [?]
Cleaner                                     [?]
Bus Boy                                     10

In factor comparison job evaluation, differ- 
ences in job difficulty at the threshold of dis- 
crimination increase by uniformly equal in- 
crements or steps while their salaries increase 
by a uniform ratio. In working with job
evaluation data the writer noticed that Weber's
Law seemed to apply and decided to experiment
to see if the value of the constant in
Weber's Law, $dR/R = C$, could be determined.


As expressed by Wundt, “all judgments are 
governed by the general principle of relativity: 
changes are estimated in terms of the thing 
which has been changed." Weber put it this 
way: “in comparing objects we perceive not the 
actual difference between them but the ratio 
of this difference to the magnitudes of the two 
objects compared." That is to say, the ob- 
served difference between two objects is not 
absolute and independent of the objects them- 
selves, but is relative to their size and is a 
constant fraction of one of them. 

This constant fraction, or “difference limen," 
must be discovered by experiment, and is 
usually the magnitude at which 75% of the 
judgments agree. Weber found that for 
weights lifted by hand it was 1/30 and for the 
length of lines 1/100. In factor comparison 
job evaluation it is about 1/7 or 15 per cent. 
Data presented here show that “just noticeable 
differences" which were correctly discriminated 
by trained job raters about three-fourths of 
the time were 15 per cent apart in salary value 
of the factor intervals. 

In creating the five factor scales in one
company, by apportioning the values of the
“key jobs” to these five factors, the intervals
on the scale were adjusted to uniform percentage
increments, in accordance with the
evidence that judgments formed by the use
of these scales followed Weber’s Law. At that
time it was thought that the different factors
required different rates of progression. For
two factors the progression rate was set at 15
per cent, for another at 11½ per cent, for
another at 20 per cent, and for the fifth at 50
per cent. After working with these scales for
a number of years it became evident that the
first two, using 15 per cent progression, were
more satisfactory than the others. The last
two scales mentioned appeared to have too
coarse intervals, and these scales were redesigned
at 15 per cent intervals.


Table 2


Agreement Between Job Evaluation Estimates by All Evaluators of a Committee and the
Final Values they Adopted


                               Number of:
                                      Evalu-    Judg-
Co.         Industry    Jobs  Factors  ators    ments    % Agreement
1 (1947)    Bank          91     5       9       4095        75
1 (1946)    Bank          89     5       9       4005        73
2 (1949)    Life Ins.    300     3       6       5400        78
2 (1947)    Life Ins.    106     3       8       2544        72


Table 3


Agreement Between Job Evaluation Estimates by All Evaluators of a Committee and the
Final Values they Adopted


                               Number of:
                                      Evalu-    Judg-
Co.         Industry    Jobs  Factors  ators    ments    % Agreement
1           Bank         112     5      10       5,600       66
6           Utility      382     5       8      15,280       62
8           Metal Mfg.   356     5       8      14,240       63
10          Life Ins.    140     5       7       4,900       6[?]
11          Textile       60     5       5       1,500       55

Table 2 shows the results for committees of 
job evaluators in different companies, all of 
whom had had more than one year’s experience 
using factor comparison job rating. Each 
member of the committee works independently 
and submits his ratings in writing. The next 
day a meeting of all evaluators is held and 
each member is given a summary sheet showing 
the evaluations submitted by all participants. 
Differences are discussed and agreement is 
reached on a satisfactory set of ratings. The 
number of agreements and of differences be- 
tween the ratings by each person and the final 
ratings that are adopted by the committee are 
tallied, which gives percentages of agreement 
for each member. Table 2 shows the average 
percentages of agreement for the members of 
the committees. In the company referred to 
in the first line of Table 2, for example, 91 jobs 
were rated in relation to each of 5 factors by 
each of the 9 members of the evaluation committee.
This process involved 4095 separate
judgments. In 75% of these judgments the
rating of the individual evaluators agreed with
the final value adopted by the committee.
Table 2 shows that experienced judges perceived
the same differences in job factors about
75 per cent of the time. Since these scales
were constructed with 15 per cent increments,
these data tend to confirm the finding that 15
per cent was about the right difference in value
of the factors.
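
The tally described above can be sketched as follows. This is only an illustration of the counting procedure as the text describes it: the function names are hypothetical, and the small set of ratings is invented for the example and is not data from any of the committees.

```python
def percent_agreement(individual, final):
    """Percent of one evaluator's ratings that equal the committee's final values.

    `individual` and `final` map (job, factor) -> point value.
    """
    matches = sum(1 for key, value in final.items() if individual.get(key) == value)
    return 100.0 * matches / len(final)


def total_judgments(jobs, factors, evaluators):
    """Each evaluator rates every job on every factor."""
    return jobs * factors * evaluators


# Hypothetical ratings for two jobs on two factors (not data from the paper).
final = {("A", "skill"): 27, ("A", "mental"): 31,
         ("B", "skill"): 23, ("B", "mental"): 27}
rater = {("A", "skill"): 27, ("A", "mental"): 35,
         ("B", "skill"): 23, ("B", "mental"): 27}

print(percent_agreement(rater, final))   # 3 of the 4 ratings agree
print(total_judgments(91, 5, 9))         # jobs x factors x evaluators, first line of Table 2
```

Averaging `percent_agreement` over the members of a committee would give the kind of figure reported in the last column of Table 2, and `total_judgments(91, 5, 9)` reproduces the 4095 judgments cited for the first company.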

Table 3 shows the data from five companies 
in different industries collected during their 
training periods. These data represent the 
judgments of relatively inexperienced committees.
The percentages of agreement are
smaller, as might be expected, ranging from 
55 to 66 per cent as against 72 to 75 per cent 
in the cases of the experienced committees 
shown in Table 2. 


Summary 


In factor comparison job evaluation, jobs
are "rated" for the assignment of salary or
wage rates by comparing them, one at a time,
with other jobs with respect to each of several
factors or elements common to all jobs.
These factor values constitute “scales,” each of
which consists of an ascending series of values
representing the relative salary value of that
factor for a job. This ascending series of
values progresses by intervals of approximately
15 per cent. The process of comparing
one job with another by factors seems to follow
Weber’s Law, in which the observed differences
are a constant percentage of one of the job
factors.

Received November 23, 1949. 

Early publication. 


Construction and Use of Verbal Analogy Items 


Abraham S. Levine 


Human Resources Research Center, Lackland Air Force Base, San Antonio, Texas 


World War I saw the emergence of the
verbal analogy item in its present form, when it
was included as a subtest in the Army Alpha.
Yerkes (13) credited Thurstone, Otis and
Bingham for the development of this type of
item. Since then, the verbal analogy item
has been incorporated into a host of widely
used tests including the ACE Psychological
Examination, National Intelligence Test, Ohio
State Psychological Test and the 1937 Revised
Stanford-Binet Test. Although it has been
commonly used as a subtest, the verbal analogy
item constitutes the sole type of item in the
Miller Analogies Test, Form G, which is now
being widely used as a selection instrument for
graduate students.
The analogy form of question is usually of
the recognition type consisting of four or five
alternative answers, only one of which is
correct, and the examinee is required to choose
the correct answer. The following item is
representative of this form:
Circle is to square as sphere is to (circumference,
cube, round, corners, ball).
Although few test makers who employ analogies
have been concerned with the theoretical
basis for this item, justification for it may be
found in the last two of Spearman’s three basic
laws of cognition dealing with the eduction of
relations and the use of the educed relation to
discover the correlate. Spearman (9, p. 180)
describes the process as follows: “First a pair of
ideas is given between which a relation has to
be cognized; and then this relation has to be
applied to a third idea, so as to generate a
fourth one called the correlate." This is in
effect what is demanded by a good analogy
item.
Several kinds of content have been introduced
into the analogies form. Thus, in addition
to verbal analogies, there are tests made
up of spatial or figure analogies and arithmetic
analogies. Perhaps the most unusual departure
from the traditional type of analogy item was
put forth in a test discussed by El Koussy (3).
He suggested that various data indicated that
g may best be measured by tests of primary
perception. In order to eliminate the effect
of spatial as well as verbal and number factors
he advocated the use of a new test, Greys
Analogy Test, which employs varying intensities
of grey in analogy form.

However intriguing these other analogy
ventures may be, the scope of this paper has
been limited to verbal analogies since the available
published data on the other forms are as
yet too meager to permit the adequate formulation
of rules for construction and use.

The general problem of validity of verbal 
analogy tests may be broken down into three 
aspects in terms of the research literature on 
the problem. First, the question may be 
asked as to how well analogies intercorrelate 
with other verbal tests, such as vocabulary 
and opposites, in order to obtain some estimate 
regarding the saturation of verbal analogies 
with verbal factors that are measured by other 
kinds of items. Perhaps the most important 
aspect of the validity problem is the question of 
intercorrelation of analogies with tests of gen- 
eral ability. In other words, will a test of 
verbal analogies by itself give you a good 
estimate of general academic ability? The 
factorial composition of verbal analogy tests 
may be regarded as still another aspect of the 
validity problem. 

With regard to the problem of intercorrela- 
tions of verbal analogies with other verbal 
tests, there is some apparently contradictory 
evidence. Kelley (6) cites an early research 
by Agnes Rogers using as subjects 61 entering 
high school girls with a mean age of 14.6 and 
an age range of 12-10 to 16-11. Rogers’ data 
showed that the correlation between analogies 
and logical opposites was only +.26. Other 
low or moderate intercorrelations were obtained
by Weisenburg, Roe and McBride (11)
employing 70 patients from orthopedic and
surgical wards of three hospitals in Philadelphia
carefully selected so as to be a fairly representative
sample of the city's population. All subjects
used English as their native tongue; ages
ranged from 10 to 59. A correlation of +.52
was obtained between printed verbal analogy 
tests and the Stanford-Binet Vocabulary sub- 
test, and one of +.31 was reported with Thorn- 
dike word knowledge. Much higher correla- 
tions are reported by both Schneck (8) and 
Anastasi (1) who both used analogies tests on 
college students. Schneck reports a +.68 
with Vocabulary and a +.71 with Opposites. 
Anastasi obtained a +.65 with Vocabulary. 
A plausible interpretation aimed at reconcilia- 
tion of these apparently contradictory results 
would be to regard verbal analogy tests as 
measuring primarily a general ability akin to 
Spearmanian g at the lower ages and IQ
levels, and verbal ability at the higher levels.
Thus it is possible that difficulty is obtained at 
the lower levels largely by relational com- 
plexity, and at the upper levels most of the 
variance is attributable to vocabulary com- 
plexity. This point bears investigation and it 
does not gainsay the possibility of adding com- 
plexity even on the college and graduate 
student levels by the addition of relational 
complexity rather than by abstruse vocabulary 
as has been more often the case. Indeed, 
Heim (5) and Hebb (4) independently have 
attempted to develop analogy tests whose 
upper sensitivity would be due to something 
other than vocabulary difficulty. 

To what extent do verbal analogies measure 
what tests of general ability measure? Ter- 
man and Merrill (10) included analogy items 
in the 1937 Revised Stanford-Binet, declaring 
it to be one of the most satisfactory tests since 
it correlated uniformly high (average +.82) at 
each age level with the composite score. In 
the previously mentioned Weisenburg, Roe 
and McBride study (11) the following correla- 
tions were obtained with the printed analogies 
test: Stanford-Binet +.72, Completion Beta 
+.63, Pintner Nonlanguage +.55, Porteus 
Maze +.62. Such evidence indicates that 
verbal analogies, at least for a heterogeneous 
population, measure largely what more general,
even non-verbal, tests measure.

Whatever evidence there is available on the
problem of factorial composition of verbal analogies
points to at least one verbal factor and
often another factor such as reasoning or deduction.
The results of these factor analyses are
not in complete agreement, probably due to differences
in the tests and populations as
well as methods of analysis and interpretation.
However, for the most part, the factorial studies
tend to agree with the logical analysis into
verbal and reasoning components.

The reliability coefficients reported for
analogy tests tend to run somewhat lower than
for other verbal tests. They characteristically
bounce around in the .70’s and .80's as compared
with the .80's and .90’s obtained for
such tests as vocabulary, opposites, similarities,
completion, etc. A combination of some or
all of the following factors may tend to explain
this finding as well as point to rules for the
appropriate use of the analogy item:

1. Greater improvement through practice is
reported for analogy tests than for other verbal
tests (2).

2. A greater number of zero scores are reported
for analogy subtests than for other
subtests, particularly for younger and older
subjects and subjects with low educational
attainment, indicating perhaps greater inherent
difficulty in understanding the analogy question
(2, 13).

3. Analogy subtests tend to be either somewhat
shorter or the time allowance is such that
the average examinee tends to complete fewer
items.

4. The analogy form of question is perhaps
subject to greater ambiguity and equivocality
of interpretation than other types of items.
An important point stressed by Paterson (7)
and apparently ignored by some analogy item
makers, is that the first two terms should
always bear the same relationship to each
other as the third and fourth terms bear.
Mixed and reversed analogies contribute to the
ambiguity of the item and convert what would
otherwise be an exercise in relational thinking
to a sort of guessing game.
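
Point 3 concerns test length. A standard way to gauge how much length alone affects reliability is the Spearman-Brown formula, which the paper does not cite; it is offered here only as an illustrative sketch, and the reliability and length figures in the example are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`
    using comparable items (Spearman-Brown prophecy formula)."""
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical case: an analogy subtest with reliability .75, doubled in length.
print(round(spearman_brown(0.75, 2.0), 2))
```

On these assumed figures, doubling the subtest would raise a coefficient of .75 to about .86, which is in line with the suggestion that the shortness of analogy subtests accounts for part of their lower reliabilities.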

The discrimination of a verbal analogy item
may be improved by increasing the degree of
association between the wrong alternatives and
the third term in the item. Zirkle (12) reports
a study where the distractors in the analogy
items were subjectively divided into three
groups on the basis of degree of closeness of
association with the third word regardless of
the first two words, and analysis showed that
as the distractors went from little association
to greatest association there was an increased
tendency to choose them. This tendency was
greatest for persons scoring in the lowest 27%
and less for persons scoring in the highest 27%.
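
A comparison of this kind can be sketched in code. The example below is not Zirkle's procedure or data: the examinee scores and answers are invented, the function names are hypothetical, and the item is the circle-square-sphere example given earlier, with "ball" treated as a distractor closely associated with the third term.

```python
def extreme_groups(total_scores, fraction=0.27):
    """Return (low_ids, high_ids): examinees in the bottom and top
    `fraction` of the total-score distribution."""
    ranked = sorted(total_scores, key=total_scores.get)
    n = max(1, round(len(ranked) * fraction))
    return ranked[:n], ranked[-n:]


def choice_rate(responses, ids, option):
    """Proportion of the examinees in `ids` who chose `option`."""
    return sum(1 for i in ids if responses[i] == option) / len(ids)


# Hypothetical data: total test scores and each examinee's answer to one item.
scores = {"e1": 4, "e2": 7, "e3": 9, "e4": 12, "e5": 15,
          "e6": 18, "e7": 21, "e8": 25, "e9": 28, "e10": 30}
answers = {"e1": "ball", "e2": "ball", "e3": "round", "e4": "cube", "e5": "ball",
           "e6": "cube", "e7": "cube", "e8": "cube", "e9": "ball", "e10": "cube"}

low, high = extreme_groups(scores)
for option in ("cube", "ball"):
    print(option, choice_rate(answers, low, option), choice_rate(answers, high, option))
```

In this invented sample the closely associated distractor "ball" is chosen more often in the lowest group than in the highest, while the keyed answer "cube" shows the opposite pattern; that is the kind of contrast an item analysis would look for.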
On the basis of the foregoing summary of
empirical findings and logical considerations,
the following tentative rules may be formulated
for the construction and use of verbal analogy
items:


1. Sufficient fore-exercises should be included
to insure adequate comprehension of
the analogy form of question. Such a precautionary
measure will tend to reduce sharply
the number of zero scores and minimize the
amount of practice effect.

2. Items should be arranged in order of
difficulty with the first few items constituting
graduated practice in analogy thinking as well
as serving as shock absorbers.

3. Avoid mixed and reversed analogies by
making sure that the first two terms are related
to each other in the same fashion as are
the last two terms.

4. Distractors should be made as plausible
as possible, and wherever feasible, at least one
of the distractors should be closely associated
with the third term in meaning.

5. The test should include enough items and
the time limit should be long enough to enable
examinees to answer sufficient items to insure
adequate reliability. It is probably desirable
to use verbal analogy items in a power test
rather than a speed test since the variable of
set may play a more important role in understanding
what is required than in other types
of items.

6. Since it is more difficult to foresee possible
interpretations of analogy questions, a rigid
selection on the basis of item analysis data is
perhaps even more important for analogies
than for other types of items.

7. It is perhaps more desirable in measuring
general ability to attain difficulty by means of
complexity of relationship rather than through
the use of uncommon vocabulary. It should
be pointed out, however, that the superiority
of the former type of item as a predictor of
academic success has not as yet been demonstrated.

8. For optimal effectiveness, the use of the 
analogy item may best be restricted to a 
college or professional school level of young 
adults. Younger and older subjects, as well 
as less intelligent and poorly educated ex- 
aminees, tend to experience the most difficulty 
in understanding the analogy form of question 
and therefore it should be used with great 
caution for these segments of the general
population.
Received June 10, 1949. 


References 


1. Anastasi, Anne. Further studies on memory fac- 
tor. Arch. Psychol., 1932, No. 142, 1-60. 
. De Weerdt, E. H. The transfer effect of practice 
in related functions upon a group intelligence 
test. Sch. & Soc., 1927, 25, 438-440. 
3. El Koussy, A. H. A note on the Greys Analogy 
Test. Brit. J. educ. Psychol., 1934, 4, 204-205. 

4. Hebb, D. O. Verbal test material independent of 
special vocabulary difficulty. J. educ. Psychol., 
1942, 33, 691-696. 

5. Heim, A.W. An attempt to test high grade intelli- 
gence. Brit. J. Psychol., 1947, 37, 70-81. 

6. Kelley, T. L. Crossroads in the mind of man. 
Stanford U., California: Stanford U. Press, 1928. 

7. Paterson, D. G. Preparation and use of new-type
examinations. New York: World Book Co.,
1926.

8. Schneck, M. R. The measurement of verbal and
numerical abilities. Arch. Psychol., 1929, No.
107, 1-49.

9. Spearman, C. Abilities of man. New York: Macmillan
Co., 1927.

10. Terman, L. M., and Merrill, M. A. Measuring
intelligence. New York: Houghton Mifflin Co.,
1937.

11. Weisenburg, T., Roe, Anne, and McBride, Kath- 
arine E. Adult intelligence. New York: Com- 
monwealth Fund, 1936. 

12. Zirkle, G. A. An analytic study of the multiple 
choice analogies test item. J. educ. Psychol., 
1946, 37, 427-435. 

13. Memoirs of the National Academy of Sciences, V. 15.
Washington: Govt. Printing Office, 1921.


Spread of Vocational Interests and General Adjustment Status * 


Samuel F. Klugman 
Veterans Administration, Philadelphia, Pa. 


Opinions as to the relationship between 
personality inventory scores and vocational 
interest scores vary from one extreme to the 
other. On one hand, for example, there are 
these statements by Tyler and Fowler, respectively.
The former states: "Scores indicating
neurotic tendencies show no appreciable
relationship to any kind of interest scores” 
(20, p. 67). The latter writes: “General agree- 
ment seems to exist, too, that interest test 
scores are not a dependable basis for conclu- 
sions about a student's attitudes and adjust- 
ment" (9, p. 26). On the other hand, Darley 
claims, “Men with primary interest patterns 
in technical occupations . . . seem to have 
markedly poorer social adjustments . . . (and) 
they show a tendency for consistently better 
home, health, and emotional adjustments" (7, 
p. 470-472). In a similar manner he differ- 
entiates individuals with Verbal and Social 
Service A patterns from those with Business 
Contact and Social Service B patterns. At a
later date, based on 80 maladjusted adult 
cases, he maintains that the data fit “into the 
hypothesis of interests as an outgrowth of 
personality" (8, p. 58). Sarbin and Berdie 
found sufficient relationship to say, “It is 
possible in the absence of other interest meas- 
urement to use the Allport-Vernon Scale (a 
personality inventory) to approximate certain 
occupational interest types as measured by 
the Strong test" (15, p. 296). Between these 
two extremes is this statement by Alteneder:
"Relationship between vocational interests
and intelligence and personality traits indicate
some interesting trends" (2, p. 459). Tussing
also declares, "From this study it does appear
that a prediction of self-confidence and of
sociability can be made with a fair amount of
accuracy..." (19, p. 72). However, he
found no relationship between Strong's Scales
and Bell's Home, Health, and Emotional Adjustment
Areas. Strong, too, feels that "interests
as measured today are related somewhat
to certain personality factors and attitudes"
(16, p. 341).

* Published with the permission of the Chief Medical
Director, Department of Medicine and Surgery, Veterans
Administration, who assumes no responsibility
for the opinions nor conclusions by the author. The
writer wishes to acknowledge his indebtedness to Dr.
Hans C. Gordon, Division of Educational Research of
the Philadelphia School System, for reading this manuscript
and offering several constructive suggestions.
He assumes, of course, no responsibility for any of the
writer's statements.

It is fairly obvious that there is a need for 
further investigation of this problem. Fur- 
thermore, the test which has been used most 
frequently is the Strong Vocational Interest 
Blank. Perhaps the use of another test would 
yield more consistent results, and toward this 
end the Kuder Preference Record was em- 
ployed in this study. 


Statement of the Problem 


The chief purpose of this study was to
determine whether any relationship exists,
among veterans, between "general adjustment
status" as measured by the total score of the
Bell Adjustment Inventory (3, p. 1) and
"spread of vocational interests" based on
Kuder Preference Record profile scores (13).

So far as this writer is aware, "spread of
interest" as used in this investigation is a
new concept and is not to be confused with
Berdie's "range of interests" which is determined
merely by counting the number of interests
claimed from a list of 22 recreational
items (4, p. 269). Essentially, "spread of
interest" is similar to Wechsler's technique of
determining the significance of a varying score
by comparing it with the individual's sub-test
mean (21, pp. 148-149). In this instance,
however, it refers to the sum of the deviations
of all scores from the individual's mean based
on the assignment of standard scores (as per
Bingham, Chapter XIX) to the nine percentile
scores of the Kuder Preference Record. The
greater the spread score the more definite are
one's likes and dislikes. Stated in another
way, one who has a profile with only very high
and very low scores is a person with a more
definite pattern of interests than one whose
scores fall closer to the mean.
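
The spread-of-interest score described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's worksheet: it assumes that the standard scores are normal-curve z-scores obtained from the percentiles (the article relies on the conversion in Bingham, Chapter XIX, which is not reproduced here), and that "sum of the deviations" means the sum of absolute deviations, since signed deviations from a mean always total zero. The function name and the two profiles are hypothetical.

```python
from statistics import NormalDist, mean


def spread_of_interest(percentiles):
    """Sum of absolute deviations of standard scores from the person's own mean.

    Assumption: percentiles (strictly between 0 and 100) are converted to
    z-scores with the inverse normal CDF; the article's own table may differ.
    """
    unit_normal = NormalDist()  # mean 0, SD 1
    z_scores = [unit_normal.inv_cdf(p / 100) for p in percentiles]
    own_mean = mean(z_scores)
    return sum(abs(z - own_mean) for z in z_scores)


# Hypothetical profiles over the nine Kuder areas: one nearly flat,
# one with only very high and very low percentiles.
flat_profile = [45, 50, 55, 50, 48, 52, 50, 47, 53]
peaked_profile = [5, 95, 10, 90, 5, 95, 10, 90, 50]

print(round(spread_of_interest(flat_profile), 2))
print(round(spread_of_interest(peaked_profile), 2))
```

Under these assumptions the peaked profile yields the larger spread score, which is the sense in which a greater spread reflects more definite likes and dislikes.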

Besides the null hypothesis, there are at
least two others which may be stated:

1. The more definite one's likes and dislikes
are, resulting in a greater spread of one's
vocational interest pattern (i.e., the greater the
sum of the deviations from his mean), the
better adjusted is he.

2. The more one's interests are distributed
in a horizontal manner over a wide area, embracing
many fields (not liking or disliking
many fields with too much intensity, resulting
in smaller deviations), the better adjusted is he.

In addition to the purpose stated above, this
study seeks to determine whether such factors
as age, education, and intelligence, as well as
spread of vocational interests and general adjustment
status, are related to the various
occupational interest areas measured by the
Kuder Preference Record.


Procedure 

During the year 1947, by random selection,
the records of 108 veterans who had applied, or
had been sent by the Regional Office, for advisement
to the Veterans Administration-University
of Pennsylvania Guidance Center, where the
present writer was serving as Chief, were used
for this study. These records contained at least
the following pertinent information: age, sex,
formal schooling, race, intelligence quotient (as
obtained on the California Mental Maturity
test, advanced form, timed), vocational interest
profile (as per Kuder Preference Record)
and personality inventory scores (Bell Adjustment
Inventory, adult form).


The Subjects 


The veterans employed in this study were 
white, native-born males who had a mean age 
of 24.5, S.D. 3.99; had completed 11.3, S.D. 
1.37, grades of schooling; and had an average 
IQ rating of 111.9, S.D. 14.78. This higher 
than average mental ability was due to two 
factors: (1) in order to minimize possible mis- 
understanding of the language elements making 
up the tests only those who had completed at 
least the Eighth Grade were selected; and (2) 
many university students dropped in for ad- 
visement purposes because the guidance center 
was so conveniently located for them. 


Results 


The obtained data for the interest test were 
not essentially different from the norm popula- 
tion. Furthermore, these data were generally 
in agreement with Rose's veterans who were 
also tested at a university Veterans Admin- 
istration Guidance Center (14). 


Table 1
Scattergram Showing Relationships Between Kuder Spread-of-Interest Scores and
Bell General Adjustment Status Scores

[Scattergram. Columns: Bell General Adjustment Status scores,* in intervals
1-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90, 91-100,
101-110, and Total. Rows: Kuder spread-of-interest scores,† in intervals,
and Total. The cell frequencies are illegible in this copy.]

* Based on total raw scores.
† Based on standard score deviations from individual means.




Table 1 is a scattergram which shows how 
the various scores for General Adjustment 
Status and for spread of vocational interests 
are distributed. By inspection, it may be 
easily noted that the scores are widely scattered 
and do not follow a consistent pattern. This 
is further verified by the calculation of a 
product-moment correlation of .01 ± .07. It
may be concluded, then, that no relationship 
was found between spread-of-interest scores 
(as measured by our suggested technique) and 
General Adjustment scores (as measured by 
the Bell Adjustment Inventory). 

In addition to the above product-moment 
correlation, others were determined for spread- 
of-interest scores and the three factors of education
(i.e., final grade completed), age, and
IQ rating. For education, a correlation of .31 
was obtained; for age .29; and for IQ rating .27. 
In all instances, the P.E. was .06. These cor- 
relations are significant at the .01 level and 
each relationship may be described as "present 
but slight." It may be said, then, that the 
veterans who were older, more intelligent, and 
possessed more formal schooling tended to have 
greater spread-of-interest scores than those who 
were younger, less intelligent, and had completed
fewer years of schooling.
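
The probable errors reported above are consistent with the classical formula for the probable error of a product-moment correlation, P.E. = 0.6745(1 - r²)/√N. The article does not state which formula was used, so that choice is an assumption; the sketch below merely checks the reported coefficients against it with N = 108. The function and variable names are hypothetical.

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Classical probable error of a product-moment correlation.

    Assumed formula (not stated in the article): 0.6745 * (1 - r^2) / sqrt(n).
    """
    return 0.6745 * (1 - r ** 2) / sqrt(n)


N_VETERANS = 108  # sample size given in the Procedure section
reported = {"education": 0.31, "age": 0.29, "IQ rating": 0.27}

for factor, r in reported.items():
    pe = probable_error_of_r(r, N_VETERANS)
    # r / P.E. shows how many probable errors the coefficient lies from zero.
    print(f"{factor}: r = {r:.2f}, P.E. = {pe:.2f}, r / P.E. = {r / pe:.1f}")
```

Under this assumption each of the three probable errors rounds to .06, in agreement with the text.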

Table 2 gives a series of relationships be- 
tween five factors and the various Kuder 
interest areas. In dividing the group into 
halves, those with percentile scores above 50 
for a particular interest area were placed in the 
Stronger interest group and those with percentiles
below 50 in the Weaker interest group.
Those with scores of exactly 50 were put in the
half with the lesser number of cases in order to 
keep the two groups as nearly alike in size as 
possible. The interpretation of critical ratios 
generally follows Guilford (10, p. 61). For 
those interested in interpretations based on 
Student's distribution for a group as large as 
this, a C.R. of 1.98 indicates confidence at the 
5 per cent level; 2.36 at the 2 per cent level; 
and 2.63 at the 1 per cent level. In terms of 
factors studied, these results were obtained: 



1. In no area did spread of vocational in- 
terest scores yield differences beyond reason- 
able doubt between those with stronger and 
weaker vocational interests. 


2. Reliably better adjustment scores were
obtained by those with stronger Scientific interest
scores. Those with weaker Artistic
interest scores earned scores which indicated
fair certainty of difference in their favor. Had
the Social Service interest scores yielded a C.R.
of 1.98 instead of 1.92, it would have been
significant at the 5 per cent level of confidence
in favor of those in the weaker half. It is
interesting to note, in passing, that Kimber
had found that Bible students had "less freedom
from nervous symptoms than the average
person, the prevailing interest was in social
service, and there was a noticeable lack of
interest in computational and clerical activities"
(11, p. 233). It will be seen that the
present findings are in agreement because those
with stronger Social Service, weaker Computational,
and weaker Clerical interests had poorer
adjustment scores. However, none of the differences
is significant although, as pointed out
above, there is a relatively high level of confidence
for the Social Service area.

3. Only in the Mechanical area was there
obtained a difference large enough to point to a
relationship with IQ ratings. With fair certainty,
it was found that those with stronger
Mechanical interests earned lower IQ ratings.
An interesting result may be noted with reference
to the Scientific category; namely, it
differentiated least between those who liked
and those who disliked scientific activities as
stated in the Kuder test. This is contrary
to a conclusion drawn by Super (17)
from Strong's findings (16). On the other
hand, it supports his statement concerning
Social Service interests (p. 24).

These findings are essentially in agreement
with those of Adkins and Kuder who, using a
test of intelligence, stated that ". . . the interpretation
of preference scores as indicative of
the presence or absence of special abilities is
unwarranted by the results of this investigation"
(1, p. 261). Similarly, Strong writes that
"the differences are too slight to be usable in
prediction of individual cases" (16).
Triggs' highest coefficient of correlation between
Kuder interest scores and ACE
scores was only .27 (Literary category),
again indicating low relationship (18, p. 349).

4. Age apparently bore no relationship to
any of the interest areas, since all critical ratios
were even below the lower criterion.

Table 2
Comparisons Between Stronger and Weaker Interest Groups for Various Factors

[Table printed sideways; the entries are illegible in this copy. For each of
the nine Kuder interest areas it compares the stronger (S) and weaker (W)
interest groups on spread of interests, general adjustment, IQ, age, and
education.]

S: Stronger interest group (above 50th percentile).
W: Weaker interest group (below 50th percentile).

5. Education seemed to be significantly re- 
lated to more interest areas than any other 
factor investigated in this study. Those with 
weaker Mechanical, weaker Clerical, and 
stronger Computational interest scores had
completed significantly more years of formal
schooling. In this connection, Crosby ob- 
tained results that “show a positive relation- 
ship between interests in certain scales of the 
Preference Record and achievement in some 
school subjects" (6, p. 103). Triggs also 
found relationships which “tend to indicate 
that interests and achievement are not totally 
unrelated" (18, p. 354). The present writer, 
on an earlier occasion, when he investigated 
the effect of schooling on clerical interests, 
stated, “. . . it is not maturation which is re- 
sponsible for the improvement in scores . . . 
but probably schooling" (12, p. 95). 

The data in Table 3 are based on the findings
reported in Table 2. Whereas in the earlier
table the relationships of the various factors
are noted within each of the interest categories
(i.e., those with stronger are compared with
those having weaker interests), the present
table is concerned with the relationships of
these factors among the interest categories.
Furthermore, the data in this table deal with
positive interest alone in the sense that comparisons
were made only between the stronger
halves (above 50th percentile) of the interest
categories. Finally, in Table 3 are the significant
differences only, i.e., those critical ratios
that are significant at least at the 5 per cent
level of confidence. The sole exception is so
close to the criterion of 1.98 that it is not amiss
to include it in this table.

Table 3
Significant Inter-Category Difference Means for the Various Factors

Factor                      Interest Categories             Difference*   C.R.
Spread of Interests         Mechanical and Clerical             ..         ..
                            Mechanical and Computational        .69       3.24
                            Mechanical and Scientific           .67       2.90
                            Mechanical and Social Service       .62       2.57
                            Mechanical and Musical              .63       2.54
                            Mechanical and Literary             .61       2.52
General Adjustment Status   Scientific and Artistic           10.52       2.56
                            Scientific and Mechanical          8.68       2.45
                            Scientific and Social Service      9.79       2.30
                            Scientific and Musical             9.15       2.26
                            Computational and Artistic         7.63       2.10
IQ Ratings                  Mechanical and Computational       5.75       1.97
Age                         Literary and Mechanical            1.51       2.07
                            Persuasive and Mechanical          1.47       1.99
Education                   Mechanical and Computational        ..        2.45
                            Mechanical and Literary             ..        2.02

* Higher mean found in second interest category of each pair.

One important fact to note in this table is
that the Mechanical interest category appears
to be significantly related to all factors. Those
with stronger Mechanical interests are less
definite in their likes and dislikes of other
vocational interests (as measured by the suggested
technique) than are those with stronger
interests in the Clerical, Computational, Scientific,
Social Service, Musical, and Literary
areas. General Adjustment status appears to
be poorer for them (since lower inventory
scores are more desirable) than for those with
strong Scientific interests. IQ ratings are
lower for them, too, than for those with
stronger interests in the Computational category.
In addition, those with stronger Mechanical
interests were reliably older than those
with positive Literary and Persuasive interests.
They also had less formal schooling than those
with stronger Computational and Literary
interests.

Those with stronger interests in the Scientific
area were better adjusted, to a fair degree of
certainty, than those with interests in Artistic,
Social Service, Mechanical, and Musical fields.
Similarly, those with positive Computational
interests were emotionally more stable than
those with Artistic ones.

It is interesting to note that Education,
which was the factor related to most intra-category
differences, did not play so important
a role in inter-category differences.
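
The grouping and comparison procedure used for Tables 2 and 3 can be sketched as follows. This is an illustration only: the article does not give its computing formula, so the critical ratio is assumed here to be the difference between two group means divided by the standard error of that difference, with unpooled sample variances. The thresholds 1.98, 2.36, and 2.63 are the ones quoted in the text; the function names, the tie rule for equal-sized halves, and the sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def split_by_interest(percentiles):
    """Return (stronger, weaker) index lists for one Kuder interest area.

    Percentiles above 50 go to the stronger half and those below 50 to the
    weaker half; exact 50s go to whichever half is currently smaller (to the
    weaker half when the halves are equal, an arbitrary choice).
    """
    stronger = [i for i, p in enumerate(percentiles) if p > 50]
    weaker = [i for i, p in enumerate(percentiles) if p < 50]
    for i, p in enumerate(percentiles):
        if p == 50:
            target = stronger if len(stronger) < len(weaker) else weaker
            target.append(i)
    return stronger, weaker


def critical_ratio(group_a, group_b):
    """Difference between means over its standard error (assumed formula)."""
    se_diff = sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se_diff


def confidence_level(cr):
    """Map a critical ratio to the levels quoted in the text."""
    size = abs(cr)
    if size >= 2.63:
        return "1 per cent"
    if size >= 2.36:
        return "2 per cent"
    if size >= 1.98:
        return "5 per cent"
    return "not significant"


# Hypothetical data: percentiles on one interest area and Bell total scores.
percentiles = [80, 65, 50, 30, 20, 90, 45, 10]
bell_scores = [22, 30, 41, 55, 48, 19, 37, 60]

stronger, weaker = split_by_interest(percentiles)
cr = critical_ratio([bell_scores[i] for i in stronger],
                    [bell_scores[i] for i in weaker])
print(stronger, weaker, round(cr, 2), confidence_level(cr))
```

A negative ratio in this example would mean a lower (more desirable) mean Bell score in the stronger-interest half.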


Summary 


The records of 108 veterans who had applied
for advisement at the Veterans Administration-University
of Pennsylvania Guidance Center
were selected at random in order to determine
what, if any, relationship existed between
vocational interest and personality inventory
scores as measured by the Kuder Preference
Record and the Bell Adjustment Inventory,
respectively. Of special interest was the determination
of the value of a new measure of
spread of vocational interests based on mean
deviations of an individual's standard scores.
In addition, the factors of age, IQ ratings, and
education were studied in relation to interests.

The following findings were obtained:

1. Spread of vocational interest, as measured
in this study, showed no relationship with
General Adjustment Status. This lack of
relationship was also found in a comparison
between those with stronger and those with
weaker interests in the various areas.

2. There was a slight tendency for the older,
more educated, and more intelligent veterans
to have greater spread-of-interest scores than
the younger, less educated, and less intelligent
ones.

3. Spread-of-interest scores succeeded in differentiating
between those with stronger Mechanical
interests and stronger Clerical, Computational,
Scientific, Social Service, Musical,
and Literary interests to a degree of fair
certainty at least. In all instances, the mean
spread score of the stronger Mechanical interests
was less than the mean spread score of the others.

4. Veterans with stronger Scientific interest
scores earned reliably better adjustment scores,
as did those with weaker Artistic interest
scores (but to a lesser degree), than their
opposite halves. General Adjustment Status
for those with stronger Scientific interests was
significantly better than for those with stronger
Artistic, Social Service, Mechanical, and
Musical interests. To a lesser extent, this
is also true for those with positive Computational
interests over positive Artistic ones.

5. IQ ratings did not differentiate within
any of the interest categories except for the
Mechanical area where those with weaker
scores earned higher IQ's to a degree of fair
reliability. A significant difference (at the 5
per cent level of confidence) was obtained, on
the basis of these scores, between those with
positive Mechanical and Computational interests.

6. The factor of age did not succeed in differentiating
any of the veterans on the basis of
stronger versus weaker vocational interest
scores. Significant age differences (at the 5
per cent level) were found between positive
Mechanical and positive Literary and Persuasive
interest areas respectively. Those with
stronger Mechanical interests were older than
those with high interests in the other two fields.

7. Education, more than any of the factors 
studied, was related to a significant degree 
when studied on an intra-category basis. However,
this factor was not so effective in differentiating
among those with positive interests.

8. The Persuasive, Literary, and Musical 
areas of interest were found not to be related 
with certainty to any of the factors investigated 
when studied on an intra-category basis. Attention
is called to the Social Service area which
is barely outside the 5 per cent level of con- 
fidence in relation to General Adjustment 
Status. When investigated on the basis of 
positive inter-category interests, however, all 
were found to be related to one or more of 
the factors. 

9. The Mechanical interest area was related 
to all the factors on an inter-category basis with 
at least fair certainty. On an intra-category 
basis, the same relationship was found for all 
factors except spread-of-interest scores. 

10. Relationship between a personality inventory
total score (Bell Adjustment) and
vocational interest test scores does not appear
to be greater when the Kuder Preference 
Record is used than when the Strong Voca- 
tional Interest Blank is employed. To use 
Strong's descriptive statement, the scores are 
“related somewhat." Therefore, it does not 
appear to be feasible, at this time, for voca- 
tional counselors, clinical psychologists, and 
others who make use of these types of tests, 
to infer the presence of definite relationships 
except in a limited manner. 

Received June 20, 1949. 


References 


1. Adkins, D. C., and Kuder, G. F. The relation of 
primary mental abilities to activity preferences. 
Psychometrika, 1940, 5, 251-262. 

2. Alteneder, L. E. The value of intelligence, per- 
sonality, and vocational interests in a guidance 
program. J. ed. Psychol., 1940, 31, 449-459. 

3. Bell, H. Manual for the Adjustment Inventory. 
Stanford University Press, 1935. 

4. Berdie, R. F. Range of interests. J. appl. Psy- 
chol., 1945, 29, 268-281. 

5. Bingham, W. V. D. Aptitudes and aptitude testing. 
New York: Harpers, 1937. 

6. Crosby, R. C. Scholastic achievement and meas- 
ured interests. J. appl. Psychol., 1943, 27, 101- 
104. 

7. Darley, J. G. A preliminary study of relations
between attitude, adjustment, and vocational
interest tests. J. ed. Psychol., 1938, 29, 467-473.

8. Darley, J. G. Clinical aspects and interpretation of
the Strong Vocational Interest Blank. New York:
Psychological Corp., 1941.

9. Fowler, F. M. Interest measurement, questions
and answers. Sch. Life, 1945, 28, 25-29.

10. Guilford, J. P. Psychometric methods. New York:
McGraw-Hill, 1936.

11. Kimber, J. A. M. Interests and personality traits
of Bible institute students. J. soc. Psychol.,
1947, 26, 225-233.

12. Klugman, S. F. Test scores for clerical aptitude
and interests before and after a year of schooling.
J. genet. Psychol., 1944, 65, 89-96.

13. Kuder, G. F. Revised Manual for the Kuder Preference
Record. Chicago: Science Research Associates,
1946.

14. Rose, W. A comparison of relative interest in
occupational groupings and activity interests as
measured by the Kuder Preference Record.
Occupations, 1948, 26, 302-307.

15. Sarbin, T. R., and Berdie, R. F. Relation of
measured interest to the Allport-Vernon study of
values. J. appl. Psychol., 1940, 24, 287-296.

16. Strong, E. K. Vocational interests of men and
women. Stanford University Press, 1943.

17. Super, D. E. Experience, emotion, and vocational
choice. Occupations, 1948, 27, 23-27.

18. Triggs, F. O. A study of the relation of Kuder
Preference Record scores to various other measures.
Educ. & Psychol. Meas., 1943, 3, 341-354.

19. Tussing, L. An investigation of the possibilities of
measuring personality traits with the Strong
Vocational Interest Blank. Educ. & Psychol.
Meas., 1942, 2, 59-74.

20. Tyler, L. E. Relationships between Strong Vocational
Interest scores and other attitude and
personality factors. J. appl. Psychol., 1945, 29,
58-67.

21. Wechsler, D. The measurement of adult intelligence.
Baltimore: Williams & Wilkins, 1944.


Kuder Interest Patterns of Psychologists *


Kuder Interest Patterns of Psychologists * 


Malcolm L. Baas
Purdue University


The primary purpose of this study was to
secure information concerning motivations, interests,
or preferences of professional psychologists,
and of graduate students in psychology.
Significance of this purpose is reflected in such
studies as The Research Project on the Selection
of Clinical Psychologists (6) being conducted at
the University of Michigan under the supervision
of E. Lowell Kelly and Donald W.
Fiske.

Cognizance of such information may introduce
considerations in the guidance of
students toward or away from psychology;
and also considerations in the evaluation of
personality patterns among practicing professional
psychologists.


Procedure 


Groups Used. Subjects from four divisions
of the American Psychological Association
were selected for this study. These divisions
were: Clinical, Industrial, Counseling and
Guidance, and Experimental and Theoretical.
From each division, sixty fellows were randomly
selected and subsequently contacted.
Also participating in the study were groups
of graduate students in both Industrial and
Clinical Psychology at Purdue University.
The graduate students were candidates for
either the M.S. or the Ph.D. degree at the
time of this study.

Administration of the Kuder Preference
Record. The subjects were requested to complete
the Kuder Preference Record, Form BM,
and return it to the author. From the originally
selected professional group of 240, returns
permitted a statistical analysis of 29 Industrial,
29 Clinical, 26 Counseling and Guidance, and
27 Experimental and Theoretical Psychologists,
a total of 111. This is a 46 per cent
return. None of the psychologists considered
in this final group was a fellow in more than
one of the divisions as listed. From the
student groups, 21 Clinical and 25 Industrial
Psychology Graduate Students were considered
in the final study.

* This article is based upon the author's thesis,
"A Study of Interest Patterns Among Professional
Psychologists," submitted to the Faculty of
Purdue University in partial fulfillment of the requirements
for the degree of Master of Science, February,
1949. The thesis was directed by Professor Franklin J.
Shaw.

Treatment of Data. Mean raw scores¹ on
each of the nine Kuder Scales were obtained 
for each of the Professional groups, and for 
the graduate student groups. Similarly, per- 
centiles, standard deviations, and standard 
errors were obtained. 

Using the Kuder Adult Norms for Men, 
percentile ranks for the mean raw scores were
found. The mean scores were also interpreted 
in terms of norms derived from the entire pro- 
fessional psychologist sample. 

To indicate significance of mean-differences 
between the divisions, Student's t-ratios were
tabulated. These ratios serve to test the null 
hypothesis concerning the differences in means 
between professional groups, and between 
student and professional groups. 
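
A t-ratio of the kind described above can be sketched as follows. The article does not give its computing formula, so the pooled-variance form of Student's t for two independent groups is assumed here; the function name and the two small samples are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance t-ratio and degrees of freedom for two independent groups.

    Assumption: both groups are treated as samples with a common variance.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    se_diff = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se_diff, df


# Hypothetical raw scores on one Kuder scale for two divisions.
division_a = [72, 80, 65, 77, 70]
division_b = [60, 66, 58, 71, 63]

t_ratio, df = student_t(division_a, division_b)
print(round(t_ratio, 2), df)
```

The resulting ratio would be referred to a table of Student's distribution with the stated degrees of freedom to test the null hypothesis of no difference in means.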

All statistical data have been reported in 
tabular and profile forms, but because of their 
extensive length they must be deleted from 
this paper. 

¹ Because of the extensive amount of statistical data,
the tabulations and profile presentations have been
deleted from this article. The original thesis is filed
in the Purdue University Library.


Results


For the entire professional sample (Table 1)
two scales exhibited strength of interest. The
Scientific and Literary scores were above the
75th percentile of Kuder's Adult norms for men.

The four divisions seem fairly consistent in
reflecting the scientific and literary interests.
For these two scales, only one significant
difference in means occurred. The Scientific
Scale difference between the means for the
Theoretical and the Clinical Divisions was
significant at the 2% level of confidence. Inasmuch
as the difference favored the Theoretical
Division, this outcome seems consistent
with the divergent patterns of activity of the
two fields.


Table 1
Kuder Mean Percentiles for All Groups

                                             Scales
Group                     Mech. Comp. Sci.  Pers. Art.  Lit.  Mus.  Soc. Serv. Cler.
Psychologists in general   32    67    82    ..    ..    ..    ..      ..       ..
Clinical                   32    55    74    ..    ..    ..    ..      ..       ..
Industrial                 40    75    82    45    40    87    50      ..       42
Counseling                 23    ..    77    33    30    86    55      ..       30
Theoretical                35    70    89    11    67    87    15      ..       19
Student Clinical           ..    22    32    59    46    76    78      ..       ..
Student Industrial         26    70    40    76    12    71    ..      ..       ..


Low interest areas for the professional psychologists
were reflected in the mean scores for
the Mechanical, the Persuasive, and the Clerical
Scales. Significant differences in means
were found on the Mechanical Scale, the
Counseling Division's mean score being significantly
less than the mean score for either
the Theoretical or the Industrial Divisions.
Levels of confidence for the differences were
5% and 1%, respectively. The Industrial and
Counseling Divisions had higher mean scores
on the Persuasive Scale than the other divisions.
Industrial Psychologists evidenced a
stronger interest on this scale than either the
Clinical or the Theoretical Divisions, with
significance at the 1% level of confidence. On
the Clerical Scale, the difference between the
means for the Industrial and Clinical Divisions
was significant at the 5% level of confidence
favoring the Industrial group.

The middle-range mean scores also showed
significant differences between the
tested groups. The Artistic Scale differentiated
between the Counseling and Industrial


pie : ; istnd 
divisions, favoring the Industrial DA p 
the 5% level of confidence. The S Divisio 
differentiated between the Theoretical 


se 
H M ose j the 
and the Counseling Division. For aet ical 
mean-difference comparisons, the The ah 


Division mean score was significant! * 
at the 2% level of confidence. No $i 
differences occurred for the Comp" 
Scale mean-score comparisons, but the 


i "Hc pasi NEN ean 
Service Scale exhibited significant Ine 


social 
dif 


i 
dp : "Theore 
Division over both Industrial and. ea v 
Divisions; and the Counseling DIV!S 


" : PW arisi 
ferences in all the inter-division compe eñ 
For the Social Service Scale, mean-di clinic”! 
at the 1% level of confidence favor the ical 

Table 2
Profile Statistics for 111 Professional Psychologists

                  Mean Raw   %-ile M         %-ile M          Standard    [illegible]
Scale             Score      (Kuder Norms)   (Sample Norms)   Deviation   of Mean
Mechanical        68.9       32              50               17.5        10
Computational     40.0       67              50               10.1        12
Scientific        71.8       82              50               12.7        [illegible]
Persuasive        51.7       24              50               17.6        15
Artistic          48.2       59              50               16.2        12
Literary          64.8       86              50               12.1        08
Musical           17.4       60              50                8.8        2.0
Social Service    78.2       60              50               22.0        4
Clerical          44.7       30              50               12.6        [illegible]


Kuder Interest Patterns of Psychologists 


both Industrial and Theoretical Divisions. 
The Social Service mean-difference between 
the Industrial and Theoretical Divisions favors 
the Industrial Division at the 2% level of con- 
fidence. And the Counseling Division’s Social 
Service mean score is significantly higher than 
the Clinical Division’s mean score at the 5% 
level of confidence. 

Significant differences also prevailed between the mean scores of the professional psychologists and their student counterparts. Only the Literary and the Clerical Scales failed to differentiate between the Professional Clinical Psychologists and the Clinical Psychology Graduate Students. With regard to the Professional Industrial Psychologists and the Students of Industrial Psychology, four scales failed to show differences in means. The four scales were: the Mechanical, the Computational, the Musical and the Clerical Scales. In view of these results it seems possible that preference patterns are considerably affected by professional experience, although Strong’s work (9) would not lend support to this view.

It is possible that interest patterns will become more stable and that strong interest areas will become better established as experience contributes to an individual’s preferences. Several investigators (1, 2, 4, 7, 11) have indicated, nevertheless, that the Kuder Preference Record can be of help in selecting and guiding students.


Summary 
In this investigation an effort was made to identify characteristic interest patterns for four different groups of professional psychologists and for two groups of student psychologists. Employing the Kuder Preference Record, preferences of groups selected from four American Psychological Association divisions and two groups of psychology graduate students were obtained. Having received the completed Kuder Preference Record forms from sample groups of Fellows in the Clinical, the Industrial, the Counseling and Guidance, and the Theoretical-Experimental Divisions, a statistical analysis of these results was made. A similar analysis was performed on the graduate student data.


The findings revealed high scores for all professional psychologists on the Scientific and Literary Scales of the Kuder Preference Record. Weak interests were evident on the Mechanical, the Persuasive, and the Clerical Scales. Several significant differences between the groups studied were found on the various scales.

The findings suggest that professional experience may contribute to preference scores, although Strong’s work on age changes in interest patterns would not support this. However, this particular investigation has not incorporated an examination of subjects’ early interest backgrounds. Age differences, experience differences, and environmental influences should be considered. The small samples, and the use of only Purdue University graduate students, are also limiting factors.


Received July 5, 1949. 


References 


1. Adkins, Dorothy C., and Kuder, G. Frederick. The relation of primary mental abilities to activity preferences. Psychometrika, 1940, 5, 251-262.

2. Bolanovitch, Daniel J., and Goodman, Charles H. A study of the Kuder Preference Record. Educ. & Psychol. Meas., 1944, 4, 315-326.

3. Crosby, R. C., and Winsor, A. L. The validity of student estimates of their interests. J. appl. Psychol., 1941, 25, 408-414.

4. Hahn, Milton E. Notes on the Kuder Preference Record. Occupations, 1945, 33, 467-470.

5. Hahn, Milton E., and Williams, Cornelia T. The measured interests of Marine Corps Women Reservists. J. appl. Psychol., 1945, 29, 198-211.

6. Kelly, E. Lowell, and Fiske, Donald W. The selection of clinical psychologists. Ann Arbor, Mich.: University of Michigan, 1948.

7. Kimber, J. A. Morris. Interests and personality traits of Bible Institute students. J. soc. Psychol., 1947, 26, 225-233.

8. Kuder, G. Frederick. The Kuder Preference Record: A revised manual. Chicago, Illinois: Science Research Associates, 1946, 30 pages.

9. Strong, E. K., Jr. Change of interest with age. Stanford University, California: Stanford University Press, 1931.

10. Strong, E. K., Jr. Vocational interests of men and women. Stanford University, California: Stanford University Press, 1943.

11. Yum, K. S. Student preferences in divisional studies and their preferential activities. J. Psychol., 1942, 13, 193-200.




A Study of Shooting Glasses by Means of Firing Accuracy * 


Sherman Ross 
Bucknell University, Lewisburg, Pennsylvania


The use of “shooting glasses” is apparently 
a very old idea in the field of hunting or of 
range firing. In the museum at Fort Necessity, 
a pair of old-fashioned spectacles with red glass, 
labelled: “Hunting Glasses, about 1830” are 
to be found. 

There have appeared on the commercial 
market several different types of glasses, with 
an assortment of filters, which have been 
declared to be shooting glasses. The major 
purpose of these glasses is to provide better 
resolution of the target, under conditions of 
glare, haze, etc. The glasses have been pro- 
duced commercially under such trade names as 
“Rayban,” “Calobar,” and “Eaglesight.” 

The use of filters in “cutting haze” and in 
providing better resolution of targets is a well- 
established notion in the field of vision. In a 
recent report, Verplanck¹ has studied the
effectiveness of selected neutral and colored 
filters in extending the visual range at which 
neutral targets, viewed against a sky back- 
ground, may be discriminated in the presence 
of haze. The results of the experiments indi- 
cate that the use of colored or neutral filters 
did not increase the range at which neutral 
targets, viewed against a sky background, may 
be discriminated. The present experiment 
deals not with the problem of discrimination, 
but with the complex of visual-psycho-motor 
variables represented by the actual firing of a 
rifle at a target. The purpose of the present 
study was to determine the effectiveness of 

several types of plastic filters when used as shooting glasses by a group of skilled Marine Corps riflemen during range firing.

* This study was performed at the Medical Field Research Laboratory, Camp Lejeune, North Carolina, during the fall of 1944, with the cooperation of the Range Battalion, Camp Lejeune. Opinions expressed in this report are those of the author and are not to be construed as official or as representing the views of the Navy Department or of the Naval Service at large. The writer should like to express his thanks to S. B. Williams, College of William and Mary, and to S. B. Lyerly, University of North Carolina, for their help on the manuscript.

¹ Verplanck, W. S. A field test of the use of filters in penetrating haze. Report on Bu M & S Research Project HM-011-003, dated 6 June 1947. From the Medical Research Department, U. S. Submarine Base, New London, Conn. Pp. 11.


Method and Procedure 


It will be convenient to discuss the method and procedure under the following headings: (1) Subjects, (2) Filters, (3) Firing Schedule, and (4) Method of Scoring.

1. Subjects: The subjects for the experiment were a group of 21 riflemen with the highest [range] qualifications, selected from a group of 28 [range] coaches who had reported for duty during the previous week.

2. Filters: Six types of filters and one [clear] control were used. The characteristics of these filters are shown in Table 1. This table shows the Polaroid Corporation code number, the polarization properties of the filter, the color, and the transmission characteristics. The plastic filters were placed in identical rubber mounted frames (Polaroid 1065), and were marked with identifying letters.

3. Firing Schedule: The tests ran for three consecutive days. Firing was begun between 0815 and 0845 and ended between 11[illegible] and 1130. The weather was clear and [illegible]; there were relatively few clouds during the tests. The sun came from over the [illegible] shoulders of the men on the firing line. The subjects were divided into three groups of seven men each. This was done for convenience in handling the goggles and [marking] the targets. Sighting in of the rifles was [done] each day previous to firing. Eight shots, slow fire, prone at 200 yards with a standard “A” target constituted a string. The M-1 rifle was used throughout the tests. The daily plan of the firing for each group of seven men is presented in Table 2. As will be seen, the firing [alternated between strings without goggles and strings with] goggles. However, when goggles were used, each man used a different type. [After Relay 1] was fired by the first group of seven men, the


Table 1
Showing the Characteristics of the Filters

         Polaroid
Filter   Corp. No.   Polarization     Color       Transmission Characteristics
A        XC92        Non-Polarizing   Neutral     Essentially complete transmission throughout
                                                  visible region except for reflection losses
B        XN10        Non-Polarizing   Neutral     Approximately 10% overall visual transmission
C        HN32        Polarizing       Neutral     Approximately 32% overall visual transmission
D        MG30        Polarizing       Neutral     Approximately 30% overall visual transmission
E        XY68¹       Non-Polarizing   Amber       Approximately 68% overall visual transmission
F        XG30²       Non-Polarizing   Green       Approximately 30% overall visual transmission
G        X050³       Non-Polarizing   Light Red   Approximately 50% overall visual transmission

¹ This filter selected because of its resemblance to the commercially produced “Rifleite” shooting glass.
² This filter selected due to its resemblance to the commercially produced “Rayban” and “Calobar” glasses.
³ This filter selected since it resembled the commercially produced “Eaglesight” shooting glass.

second group of seven men fired Relay 1. After Relay 1 was fired by the third group of seven subjects, the first group fired Relay 2, etc. The firing for a single day for each subject therefore was composed of 14 strings of eight shots, seven strings without goggles, and one string with each pair of goggles.

4. Method of Scoring: The groups were plotted immediately by spotters in the pits on


reduced score sheets (1 inch equal to 1/16 
inch). Each hit was plotted exactly in ac- 
cordance with the location of the hit on the 
target by the score keeper. These score sheets 
provided the basic data for the analysis of 
the results. Not more than 5 per cent error 
inheres in this method of scoring. 

Table 2
Firing Schedule

                              Subject
Relay   String   A1      A2     A3     A4     A5     A6     A7
1        1       None*   None   None   None   None   None   None
         2       A**     B**    C      D      E      F      G
2        3       None    None   None   None   None   None   None
         4       B       C      D      E      F      G      A
3        5       None    None   None   None   None   None   None
         6       C       D      E      F      G      A      B
4        7       None    None   None   None   None   None   None
         8       D       E      F      G      A      B      C
5        9       None    None   None   None   None   None   None
        10       E       F      G      A      B      C      D
6       11       None    None   None   None   None   None   None
        12       F       G      A      B      C      D      E
7       13       None    None   None   None   None   None   None
        14       G       A      B      C      D      E      F

* Without goggles.
** Goggle “A,” goggle “B,” etc.

After each day's firing, the data were analyzed according to the standard method known as the Mean Radius Method. In this method
the Group Center is located by a simple graphic 
process. The distance (linear in all cases) 
between the determined Group Center and the 
Center of the target is known as the Group 
Off Center measure. When the Group Center 
has been determined, the deviation in inches 
of each shot from this point is measured. The 
average deviation is known as the Mean 
Radius, and indicates the variability about this 
center. We then have a measure of central
tendency (Group Center) and a measure of 
variability about this center. In the experi- 
ment, the best measure of firing accuracy is 
given by the Mean Radius score. The Group 
Off Center measure is subject to influence by 
constant errors such as improper sighting, wind 
shifts, etc. 

The Method of Mean Radius is a method
used to obtain a precise measure of the 
variability of a group of shots. It is based 
upon some simple statistical concepts. The 
method as used by riflemen is as follows: 

Given the distribution of shots, two coordinates are drawn as follows: (1) a horizontal coordinate through the lowest shot in the group, and (2) a vertical coordinate through the shot nearest the left hand side of the target. From the horizontal coordinate, the distances are measured (in inches) between the coordinate (lowest shot) and each other shot. These distances are averaged, and a new horizontal coordinate drawn on the target as a dotted line. The same operation is performed with the vertical coordinate. When the new vertical (dotted) line is drawn in, the point of the intersection of the two dotted coordinates is known as the “Group Center.”

From this Group Center, the shortest linear distance (in inches) between it and each of the shots is measured and recorded. The average distance is determined for each distribution of shots fired and is called the “Mean Radius.”


It can be readily seen that this is a quick 
method of determining a measure of central 
tendency (the Group Center) and a measure of 
dispersion around that center (The Mean 
Radius). 
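
The graphic procedure quoted above can be sketched numerically. The following is an illustrative sketch only, not the scoring routine used in the study: the function names and the coordinate convention (hit positions as (x, y) pairs in inches) are assumptions, and the averaging is assumed to run over all shots in the string, counting the reference shot as zero distance, which makes the Group Center the centroid of the group.

```python
import math


def group_center(shots):
    """Locate the Group Center of a string of shots.

    `shots` is a list of (x, y) hit positions in inches. Distances are
    measured from a horizontal line through the lowest shot and a vertical
    line through the left-most shot, then averaged. Assumption: the average
    is taken over all shots, the reference shot contributing zero.
    """
    if not shots:
        raise ValueError("at least one shot is required")
    n = len(shots)
    base_y = min(y for _, y in shots)  # horizontal line through lowest shot
    base_x = min(x for x, _ in shots)  # vertical line through left-most shot
    center_y = base_y + sum(y - base_y for _, y in shots) / n
    center_x = base_x + sum(x - base_x for x, _ in shots) / n
    return center_x, center_y


def mean_radius(shots):
    """Average straight-line distance of the shots from the Group Center."""
    cx, cy = group_center(shots)
    return sum(math.hypot(x - cx, y - cy) for x, y in shots) / len(shots)


def group_off_center(shots, target_center=(0.0, 0.0)):
    """Distance between the Group Center and the center of the target."""
    cx, cy = group_center(shots)
    return math.hypot(cx - target_center[0], cy - target_center[1])


# Hypothetical four-shot group, for illustration only.
example = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(group_center(example), mean_radius(example), group_off_center(example))
```

Keeping the Mean Radius separate from the Group Off Center measure mirrors the distinction drawn in the text: a constant sighting error moves the Group Center without changing the Mean Radius.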

Results 


Basic Data: The basic data upon which all 
the calculations were made are not shown in 
this report. The Mean Radius score (in 
inches) for each string of shots under each 
testing condition for every subject was cal- 
culated for each day of firing. An inspection 
of Table 3 reveals that the average Mean 


Radius, regardless of test condition, varied from 3.9 to 4.9 inches, and the standard deviation varied from 0.8 to 1.4. However, [a study] of the variability of the average Mean Radius does not indicate any significant trend.

Table 3
Summary Table Showing Average Mean Radius Scores

Without Goggles ............ 4.2
Goggle A ................... 4.5
Goggle B ................... [illegible]
Goggle C ................... [illegible]
Goggle D ................... [illegible]
Goggle E ................... [illegible]
Goggle F ................... [illegible]
Goggle G ................... [illegible]


Summary of Data on Firing Accuracy: The averages of the Mean Radius scores for each test condition (without goggles and with each type of goggle) summarized for all [days of] firing are presented in Table 3. The evidence suggests the conclusion that the [use] of any goggle tends to reduce firing [accuracy]. Except for goggles “E” and “C”, the “without goggle” control firing was superior to firing with goggles. Hence, the control goggle “A” (clear plastic) is probably the best standard with which to compare the effect of the filters, rather than the “without goggle” condition. Whether or not the small differences obtained can be regarded as significant will be discussed in the following section.

Significance of Obtained Differences [between] Test Goggles: In order to estimate the significance of the differences between shooting with various goggles and shooting without goggles, the following statistical test was made. The largest average Mean Radius score difference was taken from Table 3. This difference [was that in] average Mean Radius between goggle “[illegible]” (4.1) and goggle “A” (4.5). The Critical Ratio of this difference was found to be 1.85, [or] 96.5 chances in 100 that the obtained difference is a real one and not due to chance. This level of significance is below the one usually accepted in work of this kind. Thus, we have shown that even the largest of the Mean Radius differences, though suggestive, is not statistically significant. It was therefore not considered necessary to investigate the other differences along these lines.
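
A Critical Ratio is conventionally read against the normal curve. The sketch below shows one way to turn a ratio such as 1.85 into "chances in 100"; it assumes a one-tailed reading of the unit normal distribution, which gives a value close to, but not identical with, the figure quoted in the text. The table the author actually used is not stated, and the function name is hypothetical.

```python
import math


def chances_in_100(critical_ratio):
    """One-tailed normal-curve area below `critical_ratio`, as chances in 100.

    Assumption: the Critical Ratio is treated as a unit-normal deviate.
    """
    area = 0.5 * (1.0 + math.erf(critical_ratio / math.sqrt(2.0)))
    return 100.0 * area


print(round(chances_in_100(1.85), 1))
```

Under this reading a ratio of about 2.33 would be needed to reach 99 chances in 100, which is consistent with the text's remark that 1.85 falls below the customary standard.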


Daily Trends: In Figure 1 is shown the distribution of Mean Radius scores (without goggles) for the first, second, and third days of firing. In Figure 2 is shown the distribution of Mean Radius scores with goggles for the same three days of fire. It will be noticed that there is no significant day-to-day trend in either of the two figures, and that the distributions may be superimposed. It may therefore be concluded that the continued wearing of the various shooting goggles did not in any way significantly increase the marksmanship score.




Analysis of Variance: An analysis of variance
was made, the results of which are summarized 
in Table 4. The following interpretation is 
offered. The differences among the goggles 
are clearly non-significant, and there is no 
evidence of day-to-day general shifts in firing 
accuracy. 

The differences among the subjects are highly 
significant which is, of course, not surprising. 
There is no statistical significance in the goggles 
X subjects interaction. In other words there 


is no evidence that, for example, goggle A is 


Fig. 1. Daily mean radius score of 21 subjects firing without goggles. [Figure not reproduced: number of strings plotted against Mean Radius score, with separate curves for the first, second, and third days.]

Fig. 2. Daily mean radius score of 21 subjects firing with goggles. [Figure not reproduced: number of strings plotted against Mean Radius score, with separate curves for the first, second, and third days.]
Table 4
Analysis of Variance

                          Sums of     Degrees of
Source of Variance        Squares     Freedom      Variance   F
1. Goggles                  5.3602       6           .893     F < 1.0 (non. sig.)
2. Days                     1.2456       2           .623     F < 1.0 (non. sig.)
3. Subjects               154.8697      20          7.743     8.27 (1% level)
4. Goggles X Subjects     128.2571     120          1.069     1.14 (non. sig.)
5. Days X Subjects         65.2087      40          1.630     1.74 (1% level)
6. Goggles X Days          33.3077      12          2.776     2.97 (1% level)
7. Error                  224.7513     240           .936
8. Total                  613.0183     440


better than goggle B for subject 1, but worse 
for subject 2, etc. The days X subjects inter- 
action is highly significant and means merely 
that the relative standings of the subjects in 
firing performance is a function of the day on 
which performance is measured. In other 
words, subject 1 may be better than subject 
2 on one day, but poorer on another day.²

The value for the goggles X days interaction 
is highly significant and poses a real problem. 
It must be recalled that the two simple vari- 
ances, goggles and days were not significant. 
This may mean that one goggle is significantly 
better than another on one day, but not on 
another day. Over the period of days, the 
differences could cancel out. 
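
The variance and F columns of Table 4 can be checked from the printed sums of squares and degrees of freedom. The sketch below assumes, since the text does not say so explicitly, that every source was tested against the residual error term; under that assumption the computed ratios agree with the printed ones to within rounding. The dictionary and function names are illustrative.

```python
# Sums of squares and degrees of freedom as printed in Table 4.
TABLE_4 = {
    "Goggles": (5.3602, 6),
    "Days": (1.2456, 2),
    "Subjects": (154.8697, 20),
    "Goggles X Subjects": (128.2571, 120),
    "Days X Subjects": (65.2087, 40),
    "Goggles X Days": (33.3077, 12),
}
ERROR = (224.7513, 240)


def mean_square(sum_of_squares, degrees_of_freedom):
    """Variance estimate (mean square) for one source."""
    return sum_of_squares / degrees_of_freedom


def f_ratios(sources, error):
    """F for each source, assuming the error mean square is the denominator."""
    error_ms = mean_square(*error)
    return {
        name: mean_square(ss, df) / error_ms
        for name, (ss, df) in sources.items()
    }


F = f_ratios(TABLE_4, ERROR)
for name, value in F.items():
    print(f"{name:20s} F = {value:.2f}")
```

Only the two main effects of interest, goggles and days, yield ratios below 1.0, which is the basis for the statement that neither simple variance was significant.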

Several factors might have varied during 
the test situation. For example, the condi- 
tions of light, haze, etc., might have varied 
slightly from day-to-day in ways which would 
affect the relative efficiency of the several 
goggles. 

There may have been differences in the subjects which could have come into play. The preferences of the subjects might have influenced the day's firing. The subjects were interviewed in regard to their preferences, but the stated preferences bore no particular relationship to performance.

² This is also to be expected, although it is a principle which athletic coaches are more familiar with (and possibly deal with more effectively) than experimental psychologists. There may be changes in general fitness from day to day, level of aspiration, etc.


Summary 


Six different types of plastic filters were tested as shooting glasses at the Rifle Range, Camp Lejeune, North Carolina. A clear plastic was used as a control. The subjects were 21 riflemen with the highest qualifications. They fired the standard M-1 rifle from the prone position at an “A” target at a distance of 200 yards.

The shots were plotted immediately on reduced score sheets of target design, and accuracy of fire was determined by a standard method, known as the Mean Radius method.

No statistically significant difference in performance, as determined by the Mean Radius scores, was found between firing with glasses and firing without glasses. No significant day-to-day trends were found.

The major conclusion reached is that the use of the plastic filters did not enhance firing accuracy.


Received August 9, 1949. 


Evaluation of Aircraft Instrument Displays for Use with the 
Omni-Directional Radio Range (VOR) * 


A. C. Williams, Jr., and S. N. Roscoe 


University of Illinois 


The omni-directional radio range (VOR) provides the pilot with continuous visual information concerning his bearing to or from an omni-range station. This makes it possible to fly to a station along any desired track or from a station in any desired direction, representing a considerable advantage over the conventional four-leg auditory radio range. In addition, VOR employs a very high frequency signal which provides static-free reception. It is, however, limited to line of sight distance.

The purpose of the present study was to evaluate selected aircraft instrument displays which could be used with VOR. The evaluation was not based on engineering criteria, since the human engineer is not primarily concerned with the operation of the machine as such, but rather with the performance of the man operating the machine. Thus the dependent variables which were measured and compared were the speed and accuracy with which representative pilots could use the displays in question. An inspection of existing instrument displays was sufficient to [suggest] that pilots would not be able to use these instruments either quickly or without frequent errors because of the inherent ambiguity in the immediate information presented.

* This article is based upon research conducted at the University of Illinois, Urbana, Illinois, under the auspices of the National Research Council Committee on Aviation Psychology, with funds provided by the Civil Aeronautics Administration.

¹ [Author illegible]. Psychological problems in cockpit [illegible] for the omni-directional range (ODR) and distance measuring equipment (DME). Washington, D. C.: CAA Division of Research, Report No. 76, February [year illegible].

The problem seemed essentially to be one of [modes] of display. Information presented by all displays was correct, adequate and clearly legible, but it was often difficult to interpret. In this study, three new displays were devised which, it was believed, might represent an improvement in ease of interpretation compared with existing displays. For comparison these three were evaluated in conjunction with five conventional displays of varying design. In all, eight displays were tested and compared. They are illustrated in Figures 1 to 8, inclusive.² They were: (1) Conventional Symbolic Indicator, (2) Air Line Indicator, (3) Air Force Indicator, (4) Experimental Symbolic Indicator, (5) Radio Magnetic Indicator, and (6), (7) and (8) Pictorial A, B and C, respectively, special displays designed for this study. For illustrative purposes Figures 1 and 6 only are presented here:


Fig. 1. The CONVENTIONAL SYMBOLIC INDICATOR.
The Course Line Selector reads: your selected course 
is 170 degrees. The Ambiguity Indicator reads: your 
selected course is TO the station. (If you were South 
of the station, it would read "FROM," indicating that 
your selected course of 170 degrees radiated out from 
the station in that direction.) The Course Line Devia- 
tion Indicator reads: you are to the LEFT of your 
selected course. (Without the Directional Gyro read- 
ing of 168 degrees you would know only your approxi- 
mate position and would have no way of knowing in 
which direction you were flying at the moment.) 


² To reduce printing costs, Figures 1, 2, 3, 4, 5, 6, 7 and 8 and Tables 3, 4, 5, 6, 7, and 8 have been deposited with the American Documentation Institute and may be ordered as Document 2742 from American Documentation Institute, 1719 N Street, N.W., Washington 6, D. C., remitting 50¢ for microfilm (images 1 inch high on standard 35 mm. motion picture film) or 60¢ for photocopies (6 X 8 inches) readable without optical aid.


Fig. 6. PICTORIAL A. This instrument is a visual display representing an area about the VOR station. North is always at the top. The station is the small circle in the middle of the scope. Your airplane appears as an arrow-shaped pip which shows the actual spatial relationship of your airplane to the station. The arrow head of the pip shows the heading of your airplane which can also be checked by referring to the Directional Gyro. The Track Selector serves only as a reference line indicating your desired track to or from the station.


The information given by the eight displays 
shown is in each case the same. All indicate 
that the airplane’s present position is some- 
where to the North of the VOR station and that 
it is flying on a heading of 168 degrees. On the 
first four displays the airplane’s position is 
inferred from the fact that it is somewhere to 
the left of a hypothetical course of 170 degrees 
to the station. On the others its azimuth 
position is indicated directly. On the pic- 
torial displays the airplane’s distance from 
the station is also shown, although this infor- 
mation is not given by the VOR signal and 
was not necessary for the solution of the prob- 
lems used in the experiment. In all displays 
the airplane’s heading is indicated directly. 

For those unfamiliar with such displays, 
some idea of how they are used and of their 
relative difficulty can be gained by a simple 
exercise. In each case assume that the pilot’s 
task is: To fly to the station along a track of 170 
degrees, and then try to decide how he should 
fly in order to solve the problem. What is 
the first thing he should do? (In each case the 
pilot should first turn right to a heading of approximately 215 degrees. This would cause
him to intercept his desired track at about a 
45 degree angle, thus putting him in a position 
to turn to the heading of his desired track and 
fly directly to the station.) 
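
The arithmetic behind the exercise can be made explicit. The sketch below is an illustration under stated assumptions rather than a procedure given by the authors: the function names are hypothetical, headings are in degrees on a 0 to 359 compass, and an aircraft to the left of the desired track is assumed to add the intercept angle to the track while one to the right subtracts it.

```python
def intercept_heading(desired_track, side_of_track, intercept_angle=45):
    """Heading that cuts the desired track at `intercept_angle` degrees.

    `side_of_track` is "left" or "right": the aircraft's position relative
    to the desired track as seen when facing along that track.
    """
    if side_of_track == "left":
        heading = desired_track + intercept_angle   # track lies to the right
    elif side_of_track == "right":
        heading = desired_track - intercept_angle   # track lies to the left
    else:
        raise ValueError("side_of_track must be 'left' or 'right'")
    return heading % 360


def turn_direction(present_heading, new_heading):
    """Shorter way round from the present heading to the new heading."""
    difference = (new_heading - present_heading + 180) % 360 - 180
    if difference == 0:
        return "straight"
    return "right" if difference > 0 else "left"


# The situation described in the text: left of a 170-degree track to the
# station, present heading 168 degrees.
new_heading = intercept_heading(170, "left")
print(turn_direction(168, new_heading), new_heading)
```

The modulo step keeps the result on the compass when the track is near North, for example a 350-degree track intercepted from the left.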

Designs (1) to (5) represent primarily symbolic modes of display, because they present information by means of numerical pointer readings, needle deflections, or numbers appearing in windows. Designs (6), (7) and (8) were intended to be graphic or pictorial displays because the actual horizontal spatial relations between aircraft, station and heading are shown in miniature on the face of the instrument. In general, symbolic displays provide pieces of information appropriate for a [word illegible] solution to a navigation problem, while pictorial displays provide cues appropriate for a perceptual solution to a problem. In this case the dichotomy was not absolute, because some of the symbolic displays also provided cues which could be interpreted perceptually, while the pictorial displays provided some symbolic quantitative information.

Considering these facts, an experiment was designed to measure and contrast the speed and accuracy with which representative pilots could use the eight displays in solving representative air navigation problems.


Description of the Experiment 


The experiment was designed in such a way that variability in performance due to individual differences among pilots as well as differences among displays could be estimated by the analysis of variance technique. The design did not provide specifically for an accurate estimate of the possible interaction between the various displays and the different types of navigation problems for which they [might be] used.

The displays evaluated were pictured mockups instead of actual flight instruments. The mockups were made by drawing pictures of the displays, the indications of which were made to read as they would in the real instruments had the aircraft been in a given position with respect to a VOR station. A series of ten navigation problems was set up based upon ten different positions of the aircraft with respect to the VOR station. The problems were selected so as to sample a wide variety of flight situations in which VOR might be used. Each problem specified a task which the pilot was supposed to solve by reading the display. The tasks in general were either to fly to or away from the VOR station along a track through the aircraft's present position, or to fly to or away from the station on some other designated track as might be required by traffic control. To indicate his solution the pilot was given a double multiple choice of answers. On the first part he had to indicate whether he would turn right, fly straight, or turn left to initiate his solution. On the second part he had to choose one of four headings to which he would turn if in the first part he had decided to turn, or he could indicate he would maintain the same heading if he chose not to turn. For example, if his problem were: “To approach the station along a track through my present position,” his double multiple choice of answers might be:

                 turn right                           North.
                                                      90 degrees.
    “I should    fly straight    to/on a heading of   same as present.
                                                      180 degrees.
                 turn left                            270 degrees.”


All of the answer choices were in this form. The subject was required to circle the proper words in the first and second parts so as to make a complete statement of his solution of the problem.

A set of ten mockups representing the ten problems was drawn for each display. The problem sets were similar for all displays but not identical. The basic spatial relations in a given problem remained the same for all displays, but bearings, headings, and tracks were systematically varied from one display to the next so as to prevent pilots from memorizing the problems as they went from one problem set to the next. The order of problems within each set of ten was different for each display so as further to discourage learning the problems.


Subjects 


The problem sets were given to 48 pilots divided into three groups as follows: Group I—16 non-instrument pilots with approximately 100 hours of flight experience; Group II—16 commercial pilots with CAA instrument ratings, most of whom were also flight instructors; Group III—16 scheduled airline pilots.³

³ The authors wish to express their appreciation to the United Airline pilots who served as subjects and to Dr. A. D. Tuttle and Dr. G. J. Kidera who recruited their services and arranged for the testing sessions.


Procedure 


In order to introduce the pilot-subjects to 
their task, it was found necessary to prepare 
elaborate written instructions supplemented by 
verbal briefing concerning the displays and the 
types of problems to be solved. A set of 
general instructions covering all displays and 
problem types was issued. These were fol- 
lowed by a specific set of instructions for each 
display. These written instructions were too 
extensive to be included here. However, they 
described as clearly and completely as possible 
the interpretation and use of each part of each 
display and the specific flight procedures which 
would be followed in solving each type of 
problem.⁴

The problems and procedure were pretested 
using a separate group of eight subjects. As a
result of the pretest it was found necessary to 
change some of the mockups because of pre- 
viously undetected errors in their readings. 
Likewise it was found necessary to rewrite 
many parts of the instructions because they 
were not understood by the pre-test subjects. 
In general the instructions had to be amplified. 


Experimental Design 


Since all displays could not be used simul- 
taneously by all pilots, it was necessary to 
arrange the sequences in which they were used 
by different pilots so as to balance the effects of 
practice, transfer and fatigue. Since there 
were eight displays, it was necessary for them 
to be used in eight different serial sequences.
By virtue of such an arrangement, each display 
could be presented once and only once in each 
serial position while also appearing only once 
in each of the eight serial sequences. Two 
subjects from each group, or six subjects in 
all, were assigned to use the displays in an 
order according to each sequence. 
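The counterbalancing described above corresponds to what is usually called a Latin square. The report does not list the eight sequences actually used, so the sketch below is only an illustration under stated assumptions: it builds a simple cyclic square, and the names `cyclic_latin_square` and `assign_subjects` are hypothetical.

```python
def cyclic_latin_square(n):
    """Return n sequences of n display indices forming a Latin square.

    Row r is r, r+1, ..., wrapping around modulo n, so every display
    occurs once in each sequence and once in each serial position.
    (Assumed construction; the original sequences are not given.)
    """
    return [[(row + pos) % n for pos in range(n)] for row in range(n)]


def assign_subjects(n_sequences, groups, per_group):
    """Assign `per_group` subjects from each group to every sequence.

    Returns a dict mapping sequence index -> list of (group, subject number).
    """
    assignment = {seq: [] for seq in range(n_sequences)}
    for group in groups:
        subject = 0
        for seq in range(n_sequences):
            for _ in range(per_group):
                subject += 1
                assignment[seq].append((group, subject))
    return assignment


# Eight displays, three pilot groups, two subjects per group per sequence.
square = cyclic_latin_square(8)
plan = assign_subjects(8, ["I", "II", "III"], per_group=2)
```

With these settings each of the eight sequences receives six subjects, which accounts for all 48 pilots.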

How well displays could be used by the pilot- 
subjects was measured by the time required by 
a subject to complete each set of ten problems 
and also by the number of incorrect answers 
in each set. A problem was considered
incorrectly solved if either part of the double
answer were wrong.


* Complete reproduction of these instructions may
be found in Williams, A. C., Jr., and Roscoe, S. N.
Evaluation of aircraft instrument displays for use with
the omni-directional radio range. Washington, D. C.:
CAA Division of Research, Report No. 84, March
1949.


Results 


1. Comparison of the Groups. Before any 
data pertaining to differences among displays 
were analyzed, both the time and error scores 
for the three groups were tested for homogeneity
by use of the t-test. The three groups
did not differ significantly in their time scores. 
With respect to error scores the groups were 
different, both of the more experienced groups 
making significantly fewer errors than the non- 
instrument group. Because of this difference 
the three groups will be treated separately in 
the evaluation of the displays. 

The speed and accuracy with which the 
various pilot groups used each display are 
shown in Table 1. The time scores represent 
the average time required to complete each set 
of ten problems. The error scores represent 
the average number of problems incorrectly 
solved per set of ten. 

There was a strong relationship between the 
speed and accuracy with which the three groups 
used the various displays. This is shown by 
the rank order correlation coefficients listed in 
matrix form in Table 2. These correlations 
indicate that displays which were used more 
rapidly also tended to be used more accurately. 
This relationship obtained for all groups of 
subjects, even when the time scores for one 
group were correlated against the error scores 
for a different group. 
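A rank order coefficient of the kind reported in Table 2 can be sketched as follows. This is a minimal illustration, not the authors' computation: the scores are hypothetical values (not those of Table 1), the function names are assumptions, and the coefficient is computed as the product-moment correlation of the ranks, with tied values given their mean rank.

```python
def ranks(values):
    """Rank values from 1 (smallest), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def rank_order_correlation(x, y):
    """Rank order coefficient: product-moment correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical mean scores for eight displays (not the values of Table 1).
time_scores = [6.5, 9.0, 10.0, 14.0, 17.0, 20.0, 14.5, 18.5]
error_scores = [2.0, 4.5, 4.0, 5.5, 6.5, 6.0, 5.0, 7.0]
rho = rank_order_correlation(time_scores, error_scores)
```

A coefficient near 1 indicates that displays ranked as faster also tend to be ranked as more accurate.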

2. Comparison of the Displays. Table 1 
shows considerable variability among the time
and error scores for the different displays. For 
example, using Pictorial A, the airline pilots 
required on the average six minutes to solve 
ten problems, of which only 1.5 were wrong; 
whereas they required 21.1 minutes and made 
4.8 errors per ten problems using the Air Force
Indicator. Similar ranges were found for all
three groups. 

Table 1


Average Time and Error Scores per Set of Ten Problems for Each of the Eight Displays Tested
(Time scores are in minutes)


                                  Group I                     Group II                    Group III
Display                     Time         Errors         Time         Errors         Time         Errors

Pictorial A                 [illegible]  [illegible]    [illegible]  [illegible]    [illegible]  [illegible]
Pictorial B                 [illegible]  4.4            [illegible]  [illegible]    [illegible]  [illegible]
Pictorial C                 10.0         [illegible]    [illegible]  [illegible]    [illegible]  [illegible]
Radio Magnetic Indicator    14.2         5.8            15.4         [illegible]    14.9         [illegible]
Air Line Indicator          17.2         6.3            [illegible]  [illegible]    15.9         [illegible]
Air Force Indicator         20.3         6.6            [illegible]  [illegible]    [illegible]  [illegible]
Experimental Symbolic       14.2         6.0            [illegible]  [illegible]    [illegible]  [illegible]
Conventional Symbolic       18.4         6.9            [illegible]  [illegible]    [illegible]  [illegible]


Table 2


Matrix of Rank Order Coefficients Between Time and
Error Scores for Different Displays by
the Three Subject Groups


Variables: T_I = time scores for Group I, E_I = error
scores for Group I, T_II = time scores for Group II, etc.

          T_I    E_I    T_II         E_II         T_III  E_III

T_I
E_I       .97
T_II      .98    .94
E_II      .72    .83    .68
T_III     .99    .95    [illegible]  [illegible]
E_III     .82    .83    .36          .80          .76


As a statistical test to determine whether
these apparent differences between displays
were significant enough to warrant more
detailed treatment, analyses of variance were
made for the raw speed and error scores of the
16 subjects in each of the three groups. The
design for these analyses was categorical in
displays and subjects. Thus, variance between
displays and variance between subjects
were compared with the residual variance.
Since error scores on individual problems were
dichotomous, i.e., each problem was
either right or wrong, and since time measures
for individual problems were not obtained, it
was not feasible to categorize the data in terms
of problems. Summaries of the six analyses
are given in Tables 3 to 8, inclusive, which are
deposited in ADI (see footnote 2).

These six analyses show the results to be
consistent for all three groups. There were
in all cases significant amounts of variance (at
the 1% level of confidence) attributable to
differences among the displays. Practically
all of the variance was accounted for in terms of
two sources: (1) differences between displays
and (2) differences between subjects.

On the basis of the indicated differences
among the displays, individual comparisons
were made by the t-test for correlated means.
The error scores for each group on each display
were pitted against the corresponding scores for
each of the other displays. A similar treatment
of the time scores was not made. However,
similar differences would be expected to
obtain in both cases, since the time and error
scores were known to be highly correlated (as
shown in Table 2). In any event, accuracy
would seem to be a more critical criterion than
speed in this particular case.
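The t-test for correlated means compares two displays using the same subjects' scores on each. The sketch below shows the usual computation on the differences between paired scores; the error scores are hypothetical and the function name `correlated_means_t` is an assumption, since the report gives only the resulting significance levels.

```python
from math import sqrt
from statistics import mean, stdev


def correlated_means_t(scores_a, scores_b):
    """Return (t, degrees of freedom) for paired scores from the same subjects."""
    if len(scores_a) != len(scores_b):
        raise ValueError("each subject needs a score on both displays")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator);
    # it is zero, and t undefined, when every difference is identical.
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical error scores (problems missed out of ten) for 16 subjects.
symbolic = [6, 7, 5, 8, 6, 7, 4, 6, 7, 5, 6, 8, 7, 6, 5, 7]
pictorial = [2, 3, 1, 4, 2, 2, 1, 3, 2, 2, 1, 4, 3, 2, 2, 3]
t, df = correlated_means_t(symbolic, pictorial)
```

The resulting t would then be referred to a table of t with n - 1 degrees of freedom to obtain a significance level such as those entered in Table 9.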
The results of these comparisons of the error
frequencies for the different displays are shown
in Table 9. The displays are listed at the left
according to their rank order based on error
scores. Across the top they are listed in the
reverse order. Any given percentage appearing
in the matrix refers to the level of significance
of the superiority of the display indicated
to the left over the display indicated above.
The matrix could be constructed in this way
because there were no reversals in the hierarchy
of differences. Although a higher ranking
display was not always significantly better than
lower ranking displays (see lower part of
matrix), a lower ranking display was never
superior, for any group of subjects, to one
ranking above it.

Table 9 reveals the following results. There
were virtually no significant differences among
the five displays designated as symbolic. In
four scattered cases the Radio Magnetic Indicator
was used more accurately than the three
lowest ranking displays. In three of these four
cases this advantage obtained only for the airline
pilots who may have had some additional
experience with some similar type of display,
for example, the radio compass.

In general, this matrix emphasizes the di- 
chotomy, discussed earlier, between symbolic 
and pictorial displays. Furthermore, pictorial 
displays should be centered about the station 
(with North at the top) rather than about the 
aircraft, as demonstrated by the superiority of 
Pictorial A over either Pictorial B or C. The 
airplane, rather than the station, should be 
represented by the moving part. 

3. Reliability and Internal Consistency of the 
Test. Inasmuch as the experimental technique 
employed consisted of a paper and pencil test, 
such a test should meet certain standards of 
reliability and internal consistency. 

No direct reliability data were available 
since a single form of the test was given only 
once to any one subject, and since the indi- 
vidual problem sets were hardly long enough 
to employ the split half method. However,
the reliability of the test may be inferred to 
be high on the basis of the consistency of the 
results for the three independent groups of 
subjects (see especially Tables 1, 2, and 9).
Furthermore, an item analysis showed the 
test to be internally consistent, since each of 
the ten test items or problems correlated 
acceptably with the total test error scores for 
each of the three groups. On the average the 
individual items correlated about .60 with the 
total test. (Time scores were not obtained 
for the individual items.) 
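An item analysis of this kind can be sketched as the correlation of each item's right-or-wrong score with the total error score. The report does not say which coefficient was used or whether the item was removed from the total, so the following is only an assumed illustration: it uses the product-moment correlation with the uncorrected total, on hypothetical data, with hypothetical function names.

```python
from statistics import mean


def pearson(x, y):
    """Product-moment correlation of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def item_total_correlations(error_matrix):
    """error_matrix[s][i] is 1 if subject s missed item i, else 0.

    Returns one correlation per item between that item and the total
    number of items missed (the item itself is included in the total).
    """
    totals = [sum(row) for row in error_matrix]
    n_items = len(error_matrix[0])
    return [
        pearson([row[i] for row in error_matrix], totals)
        for i in range(n_items)
    ]


# Hypothetical right/wrong records: six subjects, three items.
records = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
]
item_r = item_total_correlations(records)
```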

4. Dependence between Problems and Dis- 
plays. It would be of interest to determine 
whether some instruments are relatively more 
effective than others in working certain types 
of problems while being less effective for others, 
i.e., whether there was any “interaction” be-
tween instruments and problems. Unfortu-
nately, however, the data are not such as to
yield a direct answer to this question. Such
interaction could not readily be evaluated by
analysis of variance procedures for reasons
stated previously, namely, error scores for in-
dividual problems were dichotomous and time
scores for individual problems were not ob- 
tained. Nor were the assumptions required 
for the use of certain other statistics, such as 
chi-square, completely justified. Inspection of
the raw error data suggested, however, that
there were no marked trends indicating
dependence between problems and displays.
Since a wide variety of possible flight situations
was sampled by the ten problems, this apparent
lack of interaction is encouraging in that it
further justifies making generalizations
concerning the relative overall effectiveness of the
various displays for air navigation.


Table 9


Matrix of Significant Differences in Error Scores for Groups I, II and III on the Different Displays


Percentages indicate the level of significance of the superiority of the display indicated at the left over the
display indicated at the top, as determined by the t-test for correlated means. The single division line emphasizes
the dichotomy between the “pictorial” and the “symbolic” displays. The double division line emphasizes the
superiority of the station centered pictorial display over all others, including the airplane centered pictorials.


Composite
Error
Rank  Display  Group  C.S.I.       E.S.I.       A.F.I.       A.L.I.       R.M.I.       Pic C        Pic B        Pic A

               I      1%           1%           1%           1%           1%           2%           [illegible]
1     Pic A    II     1%           [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]
               III    1%           1%           1%           1%           1%           5%           -
======================================================================================================================
               I      1%           5%           1%           2%           2%           -
2     Pic B    II     2%           1%           -            5%           1%           [illegible]
               III    1%           [illegible]  1%           1%           2%           -

               I      1%           -            1%           2%           [illegible]
3     Pic C    II     5%           [illegible]  -            5%           -
               III    1%           1%           1%           1%           [illegible]
----------------------------------------------------------------------------------------------------------------------
               I      5%           -            -            -
4     R.M.I.   II     -            -            -            -
               III    5%           1%           2%           -

               I      -            -            -
5     A.L.I.   II     -            -            -
               III    -            -            -

               I      -            -
6     A.F.I.   II     -            -
               III    -            -

               I      -
7     E.S.I.   II     -
               III    -

8     C.S.I.


Discussion of Results


Within the scope of this experiment the
results have shown that pictorial type VOR
displays can be used to solve navigation
problems [illegible] displays. [illegible]
results are applicable to [illegible] depends upon the
significance of the discrepancies between the
experimental task and the flight task. In the
use of any such instrument display, the pilot's
task involves two aspects of performance:
[illegible] of discrimination [illegible]
must execute his [illegible].
In this experiment, it was possible to [illegible]
only the first aspect of the pilot's task, [illegible]
his decision concerning which way to [illegible] in
order to solve the navigation problem [illegible].
For this purpose it is believed that the results
are pertinent to the flight situation. [illegible]
in flight by watching the trend of the
instrument readings. But this handicap would
apply equally to all displays and would prob-

A second disadvantage of mockups is that
the subject could not manipulate the instru-
ment so as, for example, to change a “to”
course to a “from” course or vice versa, as
could be done with some of the actual instru-
ments. At the same time the opportunity to
manipulate an instrument is also an oppor-
tunity to make an error in manipulation as
has been found to be the case in a Link trainer
equipped with VOR. Even though the in-
ability to manipulate the mockup might
handicap some of the symbolic displays, the
same handicap existed for the pictorial dis-
plays in the case of their track selector which
could not be manipulated on the mockup but
could on the real instrument. For this reason
it is doubtful that lack of manipulation dis-
criminated unfairly between the various types
of displays. On the whole it is the opinion of
the investigators that so far as the pilot's
initial task of orientation and discrimination
is concerned the results of this experiment
would recur in the actual flight situation.

After deciding what to do as a result of
reading and interpreting a display, the pilot
must then execute his decision by flying the
airplane accordingly. To do this the pilot uses
additional instruments such as the artificial
horizon, altimeter, etc. But he also uses the
navigation display as a source of direction
information along with the compass and directional
gyro. When used in this way the display can
be said to be a flight instrument and,
as in the case of other flight instruments,
the pilot's task then becomes chiefly one of
alignment. Using the aircraft controls the
pilot must manipulate the airplane so as to
align a moving indicator of some sort with a
previously determined fixed reference on the
display. This sort of performance cannot be
evaluated by using static mockups because there
are no controls and no moving parts involved in the
mockups. It would be unfair and unwise to
suppose that the results of this experiment
could be used to infer the adequacy of these
displays in their capacity as flight instruments.
To the contrary there is reason to believe on
theoretical grounds that a more accurate flight
path can be achieved using some of the symbolic
displays than by using the pictorials.
Symbolic displays using the course line devia- 
tion indicator are more sensitive to displace- 
ment from the selected track than are the 
pictorial displays. Both the Air Force Indi- 
cator and the Experimental Symbolic Indicator 
were designed so that an asymptotic approach 
to a selected track could be achieved by main- 
taining a simple alignment on the face of the 
display. Presumably the ease of accomplish- 
ing this performance using these displays 
would not be surpassed using pictorial displays. 
Certainly an evaluation based on mockups is 
not relevant in this respect. 

On the whole it is accurate to say that this 
experiment has thrown no light on the ade- 
quacy of the displays when they are used as 
flight instruments. There is reason to suspect 
that those displays which were found in the 
experiment to be superior would not necessarily 
be superior when used in this manner. 

It must be remembered, however, that VOR 
is primarily a navigation device. Its chief 
purpose is to supply the pilot with information 
from which he can orient himself and from 
which he can decide on a proper direction in 
which to fly. If the VOR display fails in this 
respect so that the pilot makes wrong decisions, 
then whatever excellence it might have as a 
flight instrument is wasted. If it were a case 
of the pilot doing the wrong thing very well as 
opposed to, perhaps, doing the right thing 
passably well, the latter alternative is to be 
preferred. This experiment has contributed 
evidence bearing on the inclination of pilots 
to make correct or incorrect decisions as a result 
of using different kinds of VOR displays. For 
this reason it is felt that the experiment is 
pertinent to the problem as it exists in flight, 
and that the experimental results should be 
considered when making a choice of displays 
for actual use. 


Summary 


The speed and accuracy with which 48 pilots 
could use mockups of eight different VOR air- 
craft instrument displays to solve typical navi- 
gation problems were measured. The pilot 
group was composed of 16 non-instrument 
pilots, 16 commercial pilots with instrument 
rating, and 16 scheduled airline pilots. The 
instrument and airline pilots made fewer errors 


130 


than the non-instrument pilots but there was 
no significant difference in time scores. The 
rank order of displays based on error scores 
was highly correlated with their rank order 
based on time scores both within each pilot 
group and between groups. 

With respect to both time and error scores 
there were significant differences between dis- 
plays, similar differences being found for all 
groups. So-called pictorial displays which pre- 
sented information in terms of a graphic 
representation of the actual spatial relations 
involved were significantly superior to the so-
called symbolic displays which presented infor- 
mation in terms of dial readings, needle deflec- 
tions and numbers. One pictorial display was 
superior to all other displays. 

The reliability of the techniques used was 
inferred to be adequate from measures of in-
ternal consistency and from the similarity of
results obtained from independent groups.
Inspection of the data suggested that no
marked trends indicating dependence between 
problems and displays were evident. 


Received July 8, 1949.


The Prediction of Persistency in Premium Payment 


S. Rains Wallace, Jr., and Alfred G. Whitney 


Life Insurance Agency Management Association, Hartford, Connecticut 


This is a report of an attempt to develop
methods for predicting life insurance premium
paying behavior. While the research was
initiated for a purely practical purpose, it offers
findings of interest to the academic psychologist,
since it demonstrates that better than
chance prediction of a response which has a
quite real social and economic meaning to the
individual can be made over a period of five
years on the basis of some simple and
fragmentary background data.


The Problem 


Most life insurance companies (the commonly
held belief to the contrary) lose money
if a policyholder fails to continue his premium
payments for at least two or three years.
Even subsequent terminations of the policy
are costly to the individual agents and agency
managers, and it can be demonstrated that a
high policy termination rate in a company,
other factors being equal, results in an unsound
financial condition. Furthermore, the
companies are not unaware that the problem
has deeper sociological implications. The
termination of a policy almost always means
loss to the policyholder, both financial and
psychological. It represents a failure in a
plan for security and, to some degree, an
indication that the company has been inadequate
in meeting the problems and needs of
[illegible]. For these reasons, the companies
are eager to find factors which are associated
with persistency in premium payment and
which could be employed in training their
agents how to identify potentially persistent
clients. Of course, efforts are also made by
the companies to conserve policies already
sold, but a previous study (3) has indicated
that the persistency of a policy is much more
susceptible to the factors which occur at the
time of sale than it is to efforts made subsequent
to the sale to keep it in force.
Since considerations of selling technique
make the close catechization of a potential
buyer something less than desirable, it was
necessary to limit the study to the information 
which is routinely obtained in the standard 
policy application form. Furthermore, some 
of this information is almost certainly un- 
reliable (e.g., income, occupation, etc.) since 
it is based upon estimates or classifications 
made by the agents rather than actual records 
or the client's report. The problem, then, con- 
sists in determining methods for weighting 
these items of information in order to obtain 
the maximum degree of accuracy in predicting 
the eventual fate, i.e., the future persistency, 
of a policy at the time it is applied for. 


Methods and Procedures 


The Sample. The basic data for the study 
were first gathered in 1942 (2). Fifty-two 
life insurance companies supplied information 
on 12,499 Ordinary Life policies issued in May, 
1942. Each company was assigned a quota 
of policies on the basis of the amount of new 
life insurance sold in 1940, selected in ac- 
cordance with random sampling procedures. 
Various checks on certain aspects of the sample 
have supported the hypothesis that it may be 
regarded as typical of all Ordinary life in-
surance sold during May, 1942.

Five years later, the companies provided in- 
formation on the subsequent history of almost 
all of these policies. One hundred twenty- 
seven of the cases were lost (every company 
contributed to the follow-up and the lost cases 
were distributed throughout the entire sample) 
so that the sample was reduced to 12,372. Of 
these, 1861 policies sold on the lives of persons 
under the age of 15 (so called "Juvenile" 
insurance) were removed. Among the adult 
policies, 1243 were refused by the client, thus 
producing a still-born transaction, while 154 
cases were terminated by death, expiry, or 
maturation. These cases were also removed 
from the sample. The remaining 9114 cases 
included 3898 cases sold to men and 1245 cases
sold to women by Ordinary agents. There 
were 3971 cases sold by Combination agents. 

For the purposes of this paper, only the study 
of the 5143 policies sold by Ordinary agents 
will be reported. However, it may be stated 
that similar methods were applied to the other 
group and similar results were obtained. 
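The successive reductions of the sample reported above can be checked by simple arithmetic. The following sketch uses only the counts stated in the text; the variable names are chosen here for illustration.

```python
# Counts as stated in the text.
original_sample = 12_499
lost_in_follow_up = 127
juvenile = 1_861
refused_by_client = 1_243
death_expiry_maturation = 154

followed_up = original_sample - lost_in_follow_up
remaining = (
    followed_up - juvenile - refused_by_client - death_expiry_maturation
)

# Breakdown of the remaining cases by type of agent.
ordinary_men = 3_898
ordinary_women = 1_245
combination = 3_971
ordinary_total = ordinary_men + ordinary_women

print(followed_up, remaining, ordinary_total)
```

The figures agree with one another: the Ordinary and Combination cases together make up the 9114 remaining cases.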

The Data. For the original sample, each 
company provided the following information 
for each case: (1) the insured's annual income 
(usually as estimated by the agent); (2) the 
amount of the policy; (3) whether the insured 
underwent a medical examination; (4) the type 
of policy (whole life, endowment, term, etc.); 
(5) the manner of premium payment chosen 
by the client (monthly, quarterly, semiannu- 
ally, or annually); (6) the beneficiary (wife, 
children, parent, business partner, etc.); (7) 
the sex of the insured; (8) the age of the insured; 
(9) the marital status of the insured; (10) the 
occupation of the insured; (11) the amount of 
life insurance owned by the insured; and (12) 
the company or companies in which life in-
surance was owned.

For the follow-up sample, each company 
provided the following information for each 
case: (1) whether cash was received with the 
application; (2) whether the policy was issued 
as applied for or at some other rate, type, or 
amount; (3) the number of fractional years' 
premiums paid as of July, 1947; and (4) the 
status of the policy as of July, 1947 (in force, 
lapsed, surrendered, death, expiry, matura- 
tion, etc.). 

The Analysis. Preliminary inspection of the 
data for males indicated that students should 
be treated as a special group. The elimination 
of them, in addition to those not gainfully 
employed and those who made single premium 
payments, reduced the sample to 3448. 

For this remainder of 3448, a list of variables
was studied and persistency rates

$$\text{Persistency Rate} = \frac{\text{number of policies in force}}{\text{number of policies in force} + \text{lapses} + \text{voluntary terminations}}$$

were calculated for various subgroups. A
summary of the chief relationships which were
discovered is as follows. There is a tendency
toward a higher degree of persistency among
those: (1) who have higher incomes; (2) who
choose to make annual premium payment;
(3) who are older; (4) whose policies are sold
on an examined basis; (5) who previously
owned some life insurance; (6) who buy
other than modified life or term policies; (7)
who are proprietors, executives, or professional
workers; (8) who buy policies of larger amounts;
and (9) who are married.

Other variables which were studied and
which exhibited a lesser degree of relationship
are indicated below. There is a tendency
toward a higher degree of persistency among
those: (1) whose beneficiary is the wife or
children; (2) who pay cash with the application;
and (3) whose policy is issued as applied for.

The calculated persistency rates for this
group are shown in Table 1. With respect to
some variables the total is slightly less than
3448 because of missing data.
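The persistency rate defined above can be written as a small function. This is a sketch only; the function name and the subgroup counts in the example are hypothetical and are not taken from the study.

```python
def persistency_rate(in_force, lapses, voluntary_terminations):
    """Per cent of policies still in force among in-force, lapsed and
    voluntarily terminated policies, following the definition in the text."""
    exposed = in_force + lapses + voluntary_terminations
    if exposed == 0:
        raise ValueError("the subgroup contains no policies")
    return 100.0 * in_force / exposed


# Hypothetical subgroup: 75 policies in force, 20 lapses, 5 voluntary terminations.
example_rate = persistency_rate(75, 20, 5)
```

Policies ended by death, expiry, or maturation do not enter the denominator, which is consistent with their removal from the sample described earlier.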
Since all of the factors found to [illegible]
persistency are interrelated in varying degrees,
the problem becomes one of “weighting” or, in
other words, of determining a method for
combining nine variables in a manner to give
the highest possible predictive effectiveness.
If all of the divisions of the nine variables
could be accurately expressed on [illegible]
scales, this problem of weighting could be
solved by standard statistical techniques.
Since, however, it is difficult to quantify most
of the variables, the relationship between each
pair cannot be expressed simply.

All of the two-way relationships among the
nine factors listed in Table 1 were therefore
studied and tables indicating their nature were
prepared.¹ Since these relationships could not
be readily quantified, it was necessary to resort
to the procedure of continuous subdivision of
the groups and finally to study the persistency
of individuals having certain patterns of the
nine factors.

For the purpose of this subdividing, policies
bought on a salary savings basis were eliminated,
since it was believed that the experience
from 1942 through 1947 with [illegible]
would not be applicable to conditions [illegible].
The initial sorting of the data was made on
the factors, income and mode of premium
payment. An analysis made at that point
showed that within income and mode of premium
payment groups, the predictive power
of amount of policy and marital status
disappeared. Accordingly, these two variables
were eliminated from further consideration.
The seven remaining variables were each
divided into a number of categories as shown
in Table 2.

¹ The interested reader who wishes to examine these
relationships in detail is invited to write
for copies of these tables.

The total number of possible patterns under
such a system of classification would be the
product of the numbers of categories:

$$4 \times 4 \times 2 \times 3 \times 2 \times 4 \times 6 = 4608.$$
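The count of possible patterns is simply the product of the category counts. In the sketch below the seven counts are taken from the product given in the text; pairing each count with a variable name follows the order in which the variables are listed in Table 2 and should be read as an assumption.

```python
from math import prod

# Category counts in the order of the product in the text; the variable
# names are matched to them by position (assumed, following Table 2).
category_counts = {
    "income": 4,
    "mode of premium payment": 4,
    "medical basis": 2,
    "previous ownership of insurance": 3,
    "type of policy": 2,
    "age": 4,
    "occupation": 6,
}

total_patterns = prod(category_counts.values())
print(total_patterns)
```

With some 3,400 cases spread over this many cells, most patterns would hold fewer than one case on the average, which is why categories had to be combined before the patterns could be studied.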


Table 1


Relation Between Certain Factors and Persistency:
Sales to Adult Males (excluding students)
by Ordinary Agents


                                             Number       Persistency
                                             of           Rate
Variable                                     Cases        (%)

Income
  $5000 and over                             415          81
  $3000-$4999                                590          75
  $1500-$2999                                1739         61
  Under $1500                                704          49
  Total                                      [illegible]  61%

Mode of Premium Payment
  Annually                                   1250         75
  Semiannually                               504          65
  Quarterly                                  1077         57
  Monthly                                    318          55
  Salary savings                             209          37
  Total                                      [illegible]  [illegible]

Medical Basis
  Examined                                   2400         70
  Nonmedical                                 1038         50
  Total                                      3438         64%

Previous Ownership of Insurance
  In same company                            758          77
  In some other company only                 1445         66
  None                                       1199         53
  Total                                      [illegible]  64%

Type of Policy
  Whole life                                 1009         66
  Endowment and retirement income            699          65
  Family income                              318          65
  Limited payment life                       879          64
  Modified life and term                     529          56
  Total                                      3434         64%

Age
  45 and over                                474          78
  35 to 44                                   818          71
  25 to 34                                   1376         65
  Under 25                                   770          [illegible]
  Total                                      3438         [illegible]

Occupation
  Executives, proprietors,
    professional workers                     665          82
  Semi-executive and semi-
    professional workers                     482          72
  Agriculture and ranching                   377          69
  Clerical office workers                    329          64
  Sales clerks and salesmen                  329          58
  Factory and mine employees,
    skilled and unskilled labor,
    armed forces                             1256         50
  Total                                      3438         61%

Amount of Policy
  $6750 and over                             363          75
  $3250-$6749                                734          69
  $2750-$3249                                173          63
  $2250-$2749                                368          61
  $1750-$2249                                488          60
  $1250-$1749                                235          57
  Under $1250                                1077         60
  Total                                      [illegible]  64%

Marital Status
  Married                                    2656         67
  Single                                     731          53
  Widowed or divorced                        51           59
  Total                                      3438         64%


Table 2


Initial Variables and Categories


                                      Number of
Variable                              Categories

Income                                4
Mode of premium payment               4
Medical basis                         2
Previous ownership of insurance       3
Type of policy                        2
Age                                   4
Occupation                            6


It was decided to make the final sort on
occupation because it was believed that the
data on this variable were probably the least
reliable. Accordingly, by the time this sort
was made, it was frequently found that much
of the predictiveness of occupation had
disappeared so that, at most, a classification into
only two or three categories was required.
For other factors, too, preliminary analysis
often disclosed lessening of their influence at
various points. For example, the differences
in persistency rates among semiannual,
quarterly, and monthly business disappeared in
the highest income group and combination of
the data for these three modes of payment was
possible. By such combinations, the total
number of patterns was reduced from 4608 to a
workable number with sufficient cases for
study in each pattern.

In each of the separate patterns, the per 
cent of policies in force was calculated and the 
per cents were then converted to a single digit 
numerical rating using the equivalents shown 
in Table 3. 
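The conversion from per cent in force to a single digit rating can be sketched as a lookup over the bands of Table 3. Two points are assumptions made here: the rating for the band "Under 18" is taken to be 0, since that entry is not legible in the copy, and fractional per cents are placed in a band by their lower bound, a case the table does not address. The function name is hypothetical.

```python
# Lower bound of each per-cent-in-force band and its rating, from Table 3.
BANDS = [
    (100, 9),
    (90, 8),
    (80, 7),
    (70, 6),
    (58, 5),
    (48, 4),
    (38, 3),
    (28, 2),
    (18, 1),
]


def numerical_rating(percent_in_force):
    """Convert a per cent in force (0 to 100) to a single digit rating."""
    if not 0 <= percent_in_force <= 100:
        raise ValueError("per cent in force must lie between 0 and 100")
    for lower_bound, rating in BANDS:
        if percent_in_force >= lower_bound:
            return rating
    return 0  # assumed rating for the "Under 18" band


ratings = [numerical_rating(p) for p in (100, 99, 69, 58, 57, 18, 17)]
```

The bands are not of equal width: ratings 5 and below cover ten or twelve points each, while a rating of 9 is reserved for patterns in which every policy remained in force.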

The data were now examined to determine 
fundamental trends and where inconsistencies 
in these trends could be ascribed to chance, the 
data were smoothed in accordance with the 
trend. A description of the method of per- 
forming this smoothing can be obtained by 
request to the authors. 


Results 


This analysis, then, resulted in a Persistency 
Rater with which a score could be assigned to 
each case.² The cases in the study were then
scored and the actual persistency determined 
for each rating group with the results shown in 
Table 4. The biserial coefficient of correlation 
between the Persistency Rating and the actual 
five year persistency is shown for each group. 
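A biserial coefficient between a rating and a two-way outcome (policy in force or not) can be computed from the two group means, the standard deviation of all ratings, and the ordinate of the normal curve at the point dividing the two proportions. The sketch below follows that textbook formula; it is not taken from the paper, the data are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import NormalDist, mean, pstdev


def biserial(ratings, persisted):
    """Biserial correlation between ratings and a True/False outcome.

    r_bis = (M1 - M0) / s * (p * q / y), where M1 and M0 are the mean
    ratings of the persisting and non-persisting cases, s is the standard
    deviation of all ratings, p and q are the two proportions, and y is
    the normal ordinate at the point cutting off those proportions.
    """
    kept = [r for r, ok in zip(ratings, persisted) if ok]
    dropped = [r for r, ok in zip(ratings, persisted) if not ok]
    if not kept or not dropped:
        raise ValueError("both outcomes must occur at least once")
    p = len(kept) / len(ratings)
    q = 1.0 - p
    z = NormalDist().inv_cdf(q)   # point with proportion q below it
    y = NormalDist().pdf(z)       # height of the normal curve at that point
    return (mean(kept) - mean(dropped)) / pstdev(ratings) * (p * q / y)


# Hypothetical ratings and five-year outcomes for ten policies.
example_ratings = [7, 7, 6, 6, 5, 5, 4, 4, 3, 3]
example_outcome = [True, True, True, True, True, False, True, False, False, False]
r_bis = biserial(example_ratings, example_outcome)
```

The biserial coefficient assumes that a normally distributed variable underlies the two-way outcome; with small or irregular samples it can exceed 1, so the illustration should not be read as a check on the values reported in Table 4.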

It must be remembered, of course, that such 
“validation” of the Persistency Rater is fal- 
lacious since it is based upon the same cases 
with which the weights were determined. 
Some shrinkage should be expected when the 
Rater is applied to another sample. However, 
the writers expect such shrinkage to be slight
in light of the objective nature of the factors
used. This expectancy has been considerably
reinforced by the results of a recent study made
by one life insurance company.

This company selected, at random, 489 cases
sold in the first six months of 1946 and studied
their persistency for a two-year period. The
sample is not strictly comparable to the large
one here reported since it comes from a
different year and since the over-all persistency
of the group is considerably higher (82%
instead of 66%). However, when these cases
were scored on the Persistency Rater, the results,
as shown in Table 5, indicated a very
satisfactory level of validity.

² A correction was made in the Rater for the changes
in income which have occurred since 1942. This
correction was based upon comparative frequency
distributions of incomes among consumer units in 1942 and
1948 and upon the comparative values of the dollar in
1942 and 1949. The Rater, which is now being used
by a number of companies, may be procured upon
request to the authors.


Discussion 


It is not believed necessary to labor the point that the
discontinuance of life insurance payments is a response which has many
implications for the individual in question. Less apparent would seem
to be the possibility that such a response might have considerable
significance to the psychologist, particularly in the social field.
While the motivation behind the buying and subsequent payment of an
insurance policy is certainly something less than clear or simple, it
does appear to involve aspects of “altruism,” of “social
responsibility,” and of “economic stability.” This study demonstrates
that the response itself must have considerable reliability even
despite the fact that we may identify numerous factors which might
affect it and which are largely inde-


Table 3

Conversion Table for Persistency Rates and Numerical Ratings

Per Cent       Numerical
in Force       Rating
100            9
90-99          8
80-89          7
70-79          6
58-69          5
48-57          4
38-47          3
28-37          2
18-27          1
Under 18       0


Prediction of Persistency in Premium Payment 135 


Table 4

Proportion of Placed Policies Which Remained in Force for Five Years and Persistency Rating

               All Men Except
               Students             Male Students        Women
Persistency            Per Cent             Per Cent             Per Cent
Rating         No.     in Force     No.     in Force     No.     in Force
7 and over      808    85%          155     82%           483    86%
6               781    74           104     70            452    75
5               736    62             5     20             91    62
4               423    52             0     -              73    52
3 and under     461    35             0     -              16    38
Total          3209    66%          264     76%          1115    71%
               r_bis = .46          r_bis = .30          r_bis = .36


pendent of the individual (economic readjustments, changes in marital
status, job changes, etc.). The response would, therefore, appear to
offer considerable promise of providing a criterion for investigations
in the fields of personality or social psychology. While many of the
factors used in predicting the response may be regarded as reflecting
the transaction rather than the buyer’s characteristics, they need not
be so interpreted. For example, the fact that policies paid for on an
annual basis are more persistent may indicate the superiority of this
method of payment, but it may indicate, instead, that persons who are
capable of or who choose such a method are more likely to make the
desired response. The latter hypothesis gains some measure of support
from the fact that persons who change from an annual premium payment
to a more frequent one have almost as high a persistency as


Table 5

Proportion of Placed Policies Which Remained in Force
for Two Years and Persistency Rating

Sales by One Company to All Men Except Students

Persistency               Per Cent
Rating         Number     in Force
8                68       100%
7               142        91
6               103        81
5                78        75
4                64        71
3 and under      34        50
Total           489        82%
               r_bis = .47


those who continue the annual mode of pay- 
ment (1). Here, apparently, the interest of 
the individual and willingness to go through 
the process of changing his mode of payment 
outweighs the fact that he has more frequent 
opportunities to make the undesired response. 

A factor which could not be included in this 
study but which other studies (3) have shown 
to have considerable influence is the agent who 
makes the sale. It is known that policies sold 
by some agents have a consistently high per- 
sistency, while those sold by others are con- 
sistently low. While it seems probable that 
some of this effect is related to the agent's 
selection of clients, it appears that some 
characteristics of the relationship established 
between the agent and his client may also be 
important determiners of the client's future 
responses. It is in this field that the prospects 
for improving accuracy of prediction and for 
enhancing our understanding of the deter- 
mining factors would appear to lie. Studies 
in this direction are now in process. 
Received January 16, 1950. 

Early publication. 


References 


1. Life Insurance Agency Management Association.
Persistency 1942-1947. Hartford, Conn.: Life
Insurance Agency Management Assoc., 1949.
Pp. 123.

2. Life Insurance Sales Research Bureau. The 1942
Buyer. Hartford, Conn.: Life Insurance Sales
Research Bureau, 1942. Pp. 79.

3. Wallace, S. R., Jr., and Twichell, C. M. Factors
affecting the persistency of “Orphan Business.”
J. Amer. Soc. Chart. Life Under., 1948, 2, 398-408.


Book Reviews 


Super, Donald E. Appraising vocational fitness.
New York: Harper and Brothers,
1949. Pp. xiii+727. $6.00.

The primary purpose of the author is to 
provide an objective and detailed evaluation 
of a number of the most important tests employed
by vocational psychologists. This has
been realized in a series of chapters which are
thorough, straightforward, and with few ex- 
ceptions free from ambiguity. The author has 
summarized for each test the most significant 
research, discussed it in terms of current theory 
and practice in mental testing, and evaluated 
it on the basis of its use and reputation in 
educational guidance, consultation clinics, and 
business and industry. The work is scholarly. 

A second objective of the book is to provide the reader with work
habits and thought processes which may enable him to evaluate new
testing instruments and new research. The realization of the second
objective does not appear to be as complete as the realization of the
primary objective. Although Dr. Super has certainly not limited himself
to the cataloguing of tests and their essential statistics in the
manner which is so prevalent among texts in this field, he does not
make explicit many of the important concepts and psychological
assumptions which are implicit in the evaluation of human behavior.
He tends to avoid the discussion of controversial topics and does not
strive to illuminate for the student the broad psychological basis for
and general relevance of the samples of behavior elicited by tests.
His discussion of the basic psychological significance of some of the
test classifications is weak and his appendix on the fundamental
quantitative principles of mental measurement is not sufficient to its
purpose. Dr. Super’s work is critical and intelligent but his
discussion is no more profound than his sources.

The implications of a test score are usually discussed in terms of an
average relation with some isolated criterion and (from the standpoint
of most clinicians) with insufficient regard consistently manifested
for such variable factors as peculiarities of the individual's
motivational pattern, his characteristic reaction to competition and
the probable future enrichment or impoverishment of his environment.
Dr. Super’s insightful and stimulating discussion of these
considerations is familiar to those who have read his well-known
Dynamics of Vocational Adjustment. In his new book, however, Dr. Super
is primarily concerned with the portion of the total test variance
which is related to some known criterion. The possible determiners of
the residual variance (that which makes for errors in predictions)
receive relatively little attention. The author's privilege to make
this differential emphasis should require no justification and would
draw no comment had he not in the closing chapter of his book
discussed the techniques of vocational guidance and the conversational
devices whereby test results are to be presented to the individual.
The book's greatest value lies in its careful delineation of the most
important respects in which a test may be used. The primary emphasis
is on the use of tests. Only the obvious or experimentally illustrated
misuses of tests are emphasized; the subtle conditions under which
tests may be misused are not strongly illuminated. In presenting his
sources, the writer’s usual critical discernment was on a very few
occasions in abeyance; and although the work is not free from minor
errors, these are few and do not weaken the book.

In general, Dr. Super’s book comprises a thorough adherence to the
literature and he has presented its implications for many important
uses of tests in an expert manner. Appraising Vocational Fitness will
be a very frequently required text for courses in measurement of
individual differences, guidance, and vocational psychology.


J. R. Wittenborn
Yale University

Bellows, Roger M. Psychology of personnel in
business and industry. New York: Prentice-Hall,
Inc., 1949. Pp. xi+499. $4.50.

This book by an industrial psychologist attempts to “clarify the
boundaries of personnel methods and management from the
socio-psychological point of view.” It has been written “for those
who are interested in improving personnel management by use of
personnel systems and procedures.”

Part I, Development of Personnel Technology, defines the goals of
personnel management and describes historical influences leading to
modern personnel methods. Part II, Tools for Effective Use of
Personnel, includes separate chapters on most of the standard topics
found in personnel management texts, viz., criteria, job analysis,
recruitment, interviewing, testing, training, job evaluation,
incentives, merit rating, and turnover. Part III, Worker Satisfaction
Through Human Understanding, defines the field of “industrial social
psychology” and discusses the need for and techniques used in employee
counseling, communication among employee levels, attitude surveys, and
suggestion systems. Part IV, Implications of Personnel Technology,
describes the rising professionalization and specialization of
personnel management and the need for trained leaders and summarizes
current trends in personnel practices as influenced by research in the
field. Appendix A presents lists of publishers, organizations,
publications, and schools as sources of additional information.
Appendix B presents the Taylor-Russell tables. Each chapter ends with
a list of selected references.
The commonly-referred-to studies and literature on the topics
discussed are fairly well covered. Consideration of the human element
is stressed throughout and due recognition is made of the importance
of the social influences impinging upon the individual worker. Topics
frequently neglected are pointed up, as evidenced by separate chapters
devoted to recruitment, employee suggestion systems, and methods of
improving communication between management and employees. The need for
basic research and for considering the interaction among the various
personnel techniques is emphasized throughout. That personnel
psychology is on its way to becoming a profession is evident to the
reader.

Unfortunately, the essentially good “content” of the book is not
presented in an equally effective manner. The writing is redundant,
tautological, uneven in difficulty, and sometimes even ungrammatical.
Words and phrases are frequently so loosely used as to become almost
meaningless. Simple, painfully obvious comments are couched in
involved sentences. Even straightforward statements are frequently
unnecessarily qualified. The following are selected from literally
scores of possible examples:


“Since Pile A represented the most favorable attitude toward the
church, this represents a fairly favorable attitude toward the
church.” (p. 376)

“. . . the job-analysis approach can be useful only if job analyses
are made and . . . .” (p. 426)

“Managements, in considering whether non-directive counseling is
profitable, are concerned with the problem of evaluating it.” (p. 323)

“Systematic use of . . . are often overlooked.” (p. 159)

“Accept his feelings and attitudes and reflect it.” (p. 320)

“. . . qualifications . . . is . . .” (p. 326)

“The non-directive approach is probably best adapted to the severity
of the maladjustment.” (p. 323)

“. . . success in studying and becoming an accountant.” (p. 140)

“The sample had been picked . . . into one of five major criterion
groups.” (p. 153)

“It is a project of some size to arrange for all employees to meet in
a single room, particularly in large companies.” (p. 333)

“Luncheon menus . . . might best be placed in the restaurant or in the
hallways leading to the restaurant. . . . Departmental information
might be placed in the hallways that are used by employees in a given
department.” (pp. 337-338)

“. . . differences . . . has.” (p. 351)

“Indirect media for sharing information consist of results as a
by-product of the several personnel methods that are discussed in this
book.” (p. 345)

“This way of doing does not work so complacently in industry.”
(p. 437)

“[This test terminology] was found useful in designating the different
types of predictor devices since in this program considerable value
was found in the use of non-test predictors, which were systematically
used in the same batteries as test predictors, when found to increase
the validities of such batteries.” (p. 156)

“A list of 769 words prepared by a professor named Dale. . . .”
(p. 355)


Possibly the author was trying to "write 
down" the material to a lower level of under- 
standing than this reviewer assumes necessary. 
Perhaps he himself was trying to apply the 
Flesch techniques which he (rightly) recom- 
mends for use in employee communications. 
If so, this reviewer appreciates the effort but 
still considers the level of presentation as 
vague and wordy rather than clear and precise. 

The usual minor errors and omissions found
in first editions are present. Figure 1 is in- 
completely labelled. In Figure 25 the term 
"applicants" should be "employees." On 
page 132, "selection ratio" is improperly used, 
although defined accurately in Appendix B. 
Appendix B, incidentally, is referred to in the 
text merely in a footnote. Figure 42 is an 
essentially meaningless figure purporting to 
show “The Three Degrees" of social distance 
between worker and boss. Table 29 is pre- 
sented with no reference to it in the text. 
Table 40 should be numbered Table 39. The 
format of Appendix A is such that it is quite 
difficult to separate one item from another in 
the long lists presented. Many of the “Se- 
lected References” at the end of the chapters 
do not indicate the relevant section of the 
reference. The addresses of several of the 
journals listed in Appendix A are incorrect or 
out-of-date. 

It is easy for reviewers to point out omissions of content. As the
field of personnel psychology expands, authors are being forced to
select what materials to include. This reviewer was disappointed,
however, in not finding reference to British and French industrial
psychologists, to the use of experimental groups hired without regard
to test scores as an important step in test validation research, to
the work of the Civil Service Commission on job classification, to the
effect of the use of employment tests on quality of applicant, to the
types of tests which have been found useful in predicting job success,
to the T.W.I. program of training supervisors, to the use of
non-financial incentives, to data on voluntary restriction of output
by workers, and to the Labor-Management Committees of World War II.
In addition, although scattered references are made, there is no
systematic discussion of accidents and their prevention, of monotony,
of music in industry, of fatigue and efficiency.

This reviewer's over-all judgment is that the
author has a potentially good book at hand
but that it needs a careful going-over and
condensing before it is worth much more than
a casual reading of Chapter V. Attracting 
Personnel, Chapter XVI. Techniques ee 
proving Communications, and Chapter XVID- 
Employee Suggestion Systems. 


Albert S. Thompson
Teachers College,
Columbia University


Prentice-Hall, Inc. The new cure for white
collar unrest, 1948. Pp. 48.


a = : „pitten in 
This forty-eight page pamphlet is writ E 
the breezy style typical of Prentice-Hall pe 7 
and services for businessmen. It dont h 

introduction and four sections titled: " 
makes white collar workers tick, why selv 
collar workers kick, how unions sell then? ni 
to white collar workers, and what emp fi of 
can do for white collar workers. Over ™ pes? 
the booklet is devoted to the last ? 
sections and covers such diverse mate on" 
merit rating, suggestion systems, puo. suc 
sideration, company rules, earmarks © jensio” 
cessful supervisor, job evaluation, p ral? 
plans, getting information to employ eC -ptio 
surveys, etc. Inasmuch as this item any 
includes only a small number of -—Ó 
items covered, it is obvious that UP iot 
must be at a superficial level. This T “gt 
could find nothing to warrant inclusion ©; ney 
cure” in the title. Although the book'e tat 
free from factual error and unsupPo" le eve’ 
ments, it is basically accurate at a SI?! ext 
It might thus be of value to the “busy am o 
tive” who wishes to cover a gor T 
subject matter in a minimum of p pet 
pamphlet obviously is not intended ie gal | 
nel psychologists or professionally ctor” 
personnel and industrial relations 017 


Clifford E. Jurgensen
Minneapolis Gas Company




Gilbert, Jeanne G., and Weitz, Robert D.
Psychology for the profession of nursing.
New York: The Ronald Press Company,
1949. Pp. x+275. $5.00.


The authors state that this textbook has 
been prepared specifically for student nurses. 
The purpose is to present principles of psy- 
chology in such a way that the nurse can make 
use of them in understanding the patient and 
in making a successful personal adjustment to 
life and its problems. 

The book is divided into three parts. The emphasis in Part I,
Fundamental Principles of Psychology, is upon individual human
behavior and the factors which determine its development. In Part II,
Personality, Mental Hygiene and the Normal Patient, the integration of
all traits in the development of personality is discussed; and special
problems which have to do with the care of children, aged, and chronic
and convalescent patients are presented. Part III, Personality
Maladjustments and the Abnormal Patient, includes information about
psychopaths and other wayward types, behavior disorders in
feebleminded and organically diseased patients, the psychoneuroses and
the psychoses. Case studies are presented to illustrate various types
of abnormalities. The final chapter includes a brief presentation of
diagnostic procedures and therapeutic techniques.
In writing this particular text the authors have probably attempted to
meet the needs of those schools of nursing which, up to the present
time, have tried to cover the entire field of psychology in an
introductory course usually given to students during their first year
in the school. It is not adapted to the needs of schools which are
attempting to integrate psychological principles into every clinical
subject which is taught, and which, in addition, provide experience in
psychiatric nursing for all of their students. The material on
fundamental principles of psychology, integration of personality,
diagnostic tests and therapeutic techniques, which should probably
receive emphasis in a basic course in psychology, is so briefly
presented as to be of questionable value unless supplemented with
readings from other texts and periodicals. Yet, references at the end
of each chapter include only the names of entire books and
periodicals. Without more specific references to guide her, the
student in nursing is unlikely to do extensive supplementary reading.

In the discussion of adjustment the student 
is warned that she will be expected to adjust 
to a regimented, restricted environment where 
good judgment and adult behavior are ex- 
pected. She is told that nursing is hard and 
that she must be equipped physically, mentally, 
and emotionally to meet the hardship. A 
more positive approach, in which the student is 
challenged to think of nursing as an interesting
and worthwhile profession, would, in the long 
run, probably have a better psychological 
effect than the negative emphasis upon the im- 
portance of adjusting to conditions as they are. 


Helen Nahm 
Duke University 


Hovland, C. I., Lumsdaine, A. A., and Sheffield, F. D.
Experiments on mass communication. Studies in Social
Psychology in World War II, Vol. III. Princeton:
Princeton Univ. Press, 1949. Pp. viii+345. $5.00.


This is the third in a series of four volumes 
reporting the work of the Research Branch 
of the Army's Information and Education Di- 
vision during the war. Volumes I and II, 
previously reviewed in this journal (33, 609- 
611), discussed the attitude surveys relating 
to the soldier's adjustment to army life, and to 
combat and its aftermath, carried out by the 
Survey Section of the Research Branch. The 
present volume reports the work of the Experi- 
mental Section which had as its primary re- 
sponsibility experimental studies on mass 
communication. 

An introductory chapter outlines the vari- 
ables investigated and the general nature of the 
film research done by the Experimental Sec- 
tion. A basic distinction is “made between 
studies where the purpose is to evaluate a com- 
pleted product and those where the purpose is 
to investigate variables by controlled varia- 
tion" (p. 4). In terms of this distinction the 
book is organized into two sections, the first 
reporting the evaluative studies and the second 
the studies employing controlled variation. 
“In both kinds of studies the main emphasis
was on the measurement of changes in knowledge,
opinion, or behavior produced by a film
or other communication device” (p. 5).

The first three chapters of Part I are devoted 
to the evaluation of the “Why We Fight” 
series of indoctrination films. Chapter 5 
discusses several evaluative studies involv- 
ing an experimental comparison of alternative 
methods of presenting the same material, e.g.,
a sound motion picture and a film strip. 
Chapter 6 analyzes the effects of films on men 
from different intellectual levels. 

In Part II, one chapter is devoted to a study 
of the short-time and long-time effects of an 
orientation film upon factual knowledge and 
opinions, another to a study of the effects on 
opinion change of presenting one side as com- 
pared with both sides of a controversial issue, 
and a third to the effect of audience participa- 
tion during a film strip presentation on learning 
the phonetic alphabet. Chapter 10 is an ex- 
tremely well done bit of writing which neatly 
summarizes and evaluates the work of the 
Experimental Section. Four important ap- 
pendices follow which deal with some of the 
measurement problems encountered by the 
Experimental Section in their various studies. 

The results reported, with one exception, 
will not be particularly revealing to psycholo- 
gists. We find, for example, that it is possible 
to increase factual knowledge as a result of 
being exposed to films. Some effects are ob- 
served on opinions specifically covered by the 
films, but these effects are not as great as those 
on factual knowledge. Still less is the effect 
of films on opinions of a more general nature— 
opinions which the films were designed to in- 
fluence, but which were not specifically covered 
by the content. We find also that those with 
greater intellectual ability learn more from a 
film than those with less ability; that an- 
nouncing in advance that a quiz will be given 
on the content of a film facilitates learning; 
that audience participation also facilitates 
learning. These results, and most of the 
others not mentioned here, would be predicted 
in terms of current psychological knowledge. 

The exception, mentioned above, concerns the prevalent notion among
social psychologists that the influence of “propaganda” is merely the
reinforcing of existing beliefs. But in a radio presentation study the
Experimental Section found that “whether a man was initially for or
against the stand taken in the communication, his opinion tended to be
influenced in the direction of more acceptance of the point of view
argued for in the presentation. In no case was there found a
significant change opposite to the intended effect of the
communication among those initially opposed, and in all cases where
the audience as a whole showed a positive change, the change was
positive among those initially opposed.”
Although the concept of “attitude” is frequently used in the first two
volumes, the authors of the present volume avoid this concept,
preferring to state their discussion in terms of “opinions.” “This
usage reflects in part the fact that no satisfactory way was found for
distinguishing between attitude questions and questions of opinion”
(p. 265). But questions of opinion are distinguished from questions of
fact in that “opinions are interpretations of available facts, which
interpretation is difficult to verify or disprove directly.”
“Informed” opinions are those based upon “valid” interpretations of
available facts. “Valid” interpretations, in turn, are those “which
can be shown to be positively correlated with an index of ability to
make valid interpretations” (pp. 275-276), for example, educational
level. “Invalid” interpretations are those negatively correlated with
the index.


Pies Atos, es ather intere" io 
This distinction leads to the pep dert s t 
definition of “propaganda” as “a 


a "m ions” (p. 216). nal 
foster ‘invalid’ interpretations i this ?! e 
That there are difficulties pie P" 
ysis will be obvious if it is apP nen 


r 
Would agreement or disag 


h x ente ve 
with this statement have repres ha 


formed" opinion? But which d e 
been the “valid” interpretation 0* id" in^ 
able facts? Remember that a el with w 
pretation is one which is correlate” erp" the 
index of ability to make “valid ed DY cel 
tions and that one such index im M. 

Xperimental Section was educa ie ion P 
And now ask and answer the quest aff id 


the relationship between political par ? gno" 
tion and educational level, and y M 
also have the answer to the question 

it was the Democrats or the Repu 


he amp 
engaged in propaganda "during t” 




Many of the results of the studies are reported in terms of the
average per cent correct answers rather than in terms of the simple
arithmetic average. This can be somewhat misleading. For example,
Figure 5, page 143, reports the average per cent correct answers for a
control and various experimental groups. Taking only the figures for
the control group and the movie group, we find that the former
averaged 39.5% correct answers and the movie group 46.6%, giving a
difference, apparently the result of the movie, of 7.1%. When we
realize that the test consisted of only 39 items, this difference is
not as imposing as the per cent figures make it appear. The test mean
score for the control group would be 15.4 and for the movie group
18.2, giving a mean difference of 2.8.
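The reviewer's conversion from per cent correct to mean items correct can be reproduced directly. This is a minimal sketch of that arithmetic only; the function name `mean_items_correct` is hypothetical, and the only inputs are the figures quoted in the review (a 39-item test, 39.5% and 46.6% correct).

```python
N_ITEMS = 39  # test length stated in the review


def mean_items_correct(pct_correct, n_items=N_ITEMS):
    """Convert an average per cent correct into a mean number of items correct."""
    return pct_correct / 100 * n_items


control = mean_items_correct(39.5)  # control group
movie = mean_items_correct(46.6)    # movie group

print(f"control group: {control:.1f} items")
print(f"movie group:   {movie:.1f} items")
print(f"difference:    {movie - control:.1f} items")
```

Rounded to one decimal, these reproduce the 15.4, 18.2, and 2.8 given in the text, which is the reviewer's point: a 7.1 percentage-point gain amounts to fewer than three items on a test of this length.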

It is impossible to do justice to this book in a brief review, but let
me say that I found it highly provocative and stimulating. Few
experimental studies dealing with such complex situations have been as
carefully planned, executed, and analyzed as those reported in this
volume. The results, and particularly the methodology, should prove of
interest to psychologists, film producers, and anyone concerned with
audio-visual aids in indoctrination or training programs, whether in
education or industry.
Nor is the book concerned with just methods and the reporting of
experimental results. These authors have not been afraid to speculate,
to hypothesize, to indulge in conjecture. This provides stimulating
reading in an area where theory is largely lacking. Questions are not
only answered but raised, with the result that sufficient research
problems are suggested to occupy workers in this field for some time
to come.


Allen L. Edwards 
The University of Washington 


Mathewson, Robert H. Guidance policy and
practice. New York: Harper and Brothers,
1949. Pp. 294. $3.00.


The growing pains and self-examination which have characterized the
profession of psychology during the years since World War II have been
characteristic, also, of the kindred and partly overlapping fields of
guidance.


In the American Psychological Association, 
various divisions have been struggling with 
definitions and attempting to delimit fields, 
with both clinical and educational psycholo- 
gists feeling that guidance belongs to them 
while the guidance psychologists contend that 
they represent a distinct field. So, also, in 
the broader fields of education and social 
service, the professions of psychology, educa- 
tion, and social work have been staking claims 
to guidance, each claiming it as its own. To 
complicate matters, there has for some years 
been considerable debate among specialists in 
guidance as to just how the term is to be 
defined, just what needs it attempts to meet, 
and just what services it should provide in 
order to meet them. 

Guidance Policy and Practice is an attempt 
to define terms, clarify objectives, examine 
methods in the light of these objectives, and 
so delimit the field. It represents the thinking 
of a guidance specialist who has had long ex- 
perience as a practitioner at both adolescent 
and adult levels (under educational auspices), 
and who has, during the past few years, given 
considerable time to the training of graduate 
students. It is at a strategic time in the development
of guidance and of professional psychology
that Mathewson has put his thinking
on record.

In Part I Mathewson discusses fundamental 
factors such as psychological and philosophical 
concepts, individual and social needs, institu- 
tional settings, the psychology of the partic- 
ipants, and costs. Part II treats the im- 
plementation of policies in schools and in the 
community. Part III takes up current issues 
such as the scope of guidance, the responsibility 
of education, and the training and role of the 
counselor. Part IV examines the future, 
particularly the trend toward unity or toward 
a broader concept of guidance than has pre- 
vailed in some circles, educational institutions 
as the primary medium of guidance, and the 
development of sound national policy in a 
period of increasing federal activity. 

Although the book is uneven (the chapter on The Institutional Setting
seems to deal with the obvious, while that on The Role of the
Counselor is a timely and effective exposition of a much more mature
approach to counseling than, for example, either Williamson's or
Rogers’), it is on the whole thoughtful, stimulating, and
constructive. This reviewer would like to focus on two major
contributions, and to point out some related weaknesses.

Perhaps the strongest aspect of Mathewson's
philosophy of guidance is his reiteration of the 
developmental nature of guidance. He sees it 
as a process of orientation of the individual to 
society and of the development of a mature, 
independent, socialized self. This approach
resembles that of many educators, and stands 
in clear opposition to and contrast with that 
of many guidance specialists who came to this 
field from social work, and who view guidance 
as a service for “persons in need . . . experi- 
encing some breakdown in their capacity to 
cope unaided with their own affairs,” whether 
these be the choice of a college major or re- 
habilitation for employment after injury. 
Mathewson conceives of guidance as primarily 
educational and preventive rather than re- 
medial; he concludes that schools and colleges, 
as institutions with principal responsibility for 
education and development, are the major 
guidance agencies. He recognizes that guid- 
ance also has an adjustive or remedial function, 
but one of the defects of the argument (sur- 
prising in an author who has worked at the 
adult level) is the failure to recognize the fact 
that guidance needs and services do not cease
at age 22. In the vocational sphere, for ex- 
ample, guidance does not actually cease with 
“early adjustment” (p. 120), but, as geronto- 
logy is increasingly demonstrating, continues 
through life, with new problems arising and 
new services called for at each of the life stages. 
In this reviewer's opinion, then, Mathewson's 
definition of the Scope and agencies of guidance 
is too limited, even though his emphasis is 
Sound. 

The second contribution singled out for comment 
in this review is the discussion of the 
trend toward unity in the field of guidance. As 
the recommendations of policy committees of 
the National Vocational Guidance Association 
in the '30s and again last year have shown, and 


as evidenced further by current camar 
work in the Council of Guidance and I m 
Associations, there is an increasing EAR 
of the fact that adjustment involves a nid 
personality, that the problem of ST tiis 
choice, for example, is very much a vy ich is 
problem of developing a self-concept W Tec 
in harmony with reality. Mathewson define 
nizes the unity of personality, attempts pes 
guidance in broad terms, and forecasts pe d. 
ing acceptance of this point of view. yo mis: 
this reviewer's judgment he makes b in 
takes made by many educationists eem 
the field of guidance: he fails to jijy, aid 
sufficiently the complexity of persue (perhaps 
failing to do this, errs further by veh the 
unintentionally) leaving the reader E educ 
impression that all the vocational * become 
tional counselor needs to do in patte y? 
à personal counselor is to recognize fen of Pi 
personality. To recognize the wor. g 
sonality and of guidance paia jety? 
the complexity of personality ‘ang ps ra t 
including the world of work, is to Jards. I 
amateurish work and to lower amant c " 
encourages vocational and educatio” vito 
selors to venture into psychotherapy ©. en 
"ge . Á ecialty; int? 
adequate training in that sp venture P 
courages clinical psychologists to V' g with 
vocational and educational counseling al 
sufficient knowledge of opportunit! 
quirements in schools, colleges, an< 
of employment. conis 
Despite these defects, Metti Bes. 
volume is a thoughtful and impor func? 
bution to thinking on the nat bile it y 
and organization of guidance. 
not contain the final solutions 
rent problems of definition, A 
jurisdiction, it should be read M i uve. 
interested in finding a truly cons 3 nizat 
out of our present functional and © 
dilemmas. 


Teachers College, 
Columbia University 




New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor, 
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota. 


Interaction process analysis. Robert F. Bales. Cam- 
bridge: Addison-Wesley Press, Inc., 1950. Pp. 203. 
$6.00. 

Giant brains or machines that think. Edmund Callis 
Berkeley. New York: John Wiley and Sons, Inc., 
1949. Pp. 255. $4.00. 

Problems in personnel administration. Richard P. Calhoon. 
New York: Harper and Brothers, 1949. Pp. 
540. $4.00. 

Immediate and retention effects of interpolated rest periods 
on learning performance. Bertram Epstein. New 
York: Bureau of Publications, Teachers College, 
Columbia University, 1949. Pp. 77. $2.10. 

Scholarships, fellowships and loans. S. Norman Feingold. 
Boston: Bellman Publishing Co., Inc., 1949. 
Pp. 256. $6.00. 

The expanding role of government and labor in the American 
economy. Waldo E. Fisher. Bulletin No. 18. 
Pasadena: Industrial Relations Section, California 
Institute of Technology, 1950. Pp. 26. $1.00. 

Survey of selected personnel practices in Los Angeles 
County as of April 1, 1949. Bulletin No. 17. Robert 
D. Gray et al. Pasadena: Industrial Relations Section, 
California Institute of Technology, 1949. Pp. 
... $2.50. 

Personnel practices of large employers in Los 
Angeles County as of April 1, 1949. Circular No. 18. 
Robert D. Gray et al. Pasadena: Industrial Relations 
Section, California Institute of Technology, 
1949. Pp. 12. $1.00. 

Personality, development and assessment. Charles M. 
Harsh and H. G. Schrickel. New York: The Ronald 
Press Co., 1950. Pp. 518. $5.00. 

Psychology in everyday living. Ralph Leslie Johns. 
New York: Harper and Brothers, 1950. Pp. 464. 
$3.30. 

Practical job evaluation. Philip W. Jones. New York: 
John Wiley and Sons, Inc., 1948. Pp. 304. $4.00. 

Psychology of labor-management relations. Arthur Kornhauser, 
Editor. Champaign, Ill.: Industrial Relations 
Research Association, 1949. Pp. 122. $1.50. 

Practical and theoretical aspects of psychoanalysis. Lawrence 
S. Kubie. New York: International Universities 
Press, 1950. Pp. 252. $4.00. 

Rehearsal for destruction. Paul Massing. New York: 
Harper and Brothers, 1949. Pp. 341. $4.00. 

Patterns of panic. Joost A. M. Meerloo. New York: 
International Universities Press, 1950. Pp. 120. 
$2.00. 

Educational acceleration, appraisals and basic problems. 
Sidney L. Pressey. Columbus: Bureau of Educational 
Research, Ohio State University, 1949. Pp. 
153. $2.50, paper; $3.00, cloth. 

Problems of infancy and childhood. Milton J. E. Senn, 
Editor. New York: Josiah Macy, Jr. Foundation, 
1949. Pp. 156. 

Varieties of delinquent youth. William H. Sheldon, 
Emil M. Hartl and Eugene McDermott. New York: 
Harper and Brothers, 1949. Pp. 899. $8.00. 

The psychologist in industry. M. E. Steiner. Springfield, 
Ill.: Charles C. Thomas, Publisher, 1950. Pp. 
107. $2.00. 

Personnel management for supervisors. Claude E. 
Thompson. New York: Prentice-Hall, Inc., 1949. 
Pp. 192. $2.95. 

Human relations in modern industry. R. F. Tredgold. 
New York: International Universities Press, Inc., 
1950. $2.50. 

Achieving maturity. Jane Warters. New York: McGraw-Hill 
Book Co., Inc., 1949. Pp. 349. $3.00. 

Counseling and discipline. E. G. Williamson and J. D. 
Foley. New York: McGraw-Hill Book Co., Inc., 
1949. Pp. 387. $3.75. 

Comprehensive examinations in a program of general education. 
Board of Examiners, Michigan State College. 
East Lansing: Michigan State College Press, 1950. 
Pp. 165. $4.00. 

Children absent from school. New York: Citizens’ Committee 
on Children of New York City, Inc., 1949. 
Pp. 116. $1.00. 

Advisory service for students of advertising. Westport, 
Conn.: Advisory Service, 1949. Pp. 80. $1.25. 


1949 DIRECTORY 


AMERICAN PSYCHOLOGICAL ASSOCIATION 


1515 MASSACHUSETTS AVENUE N. W. 
WASHINGTON 5, D. C. 


In the alphabetical list of 6735 members, the 1949 Directory of the 
Association gives the names of the members, their addresses, their 
present positions, their last degrees, and their class of membership. 
Membership lists for the Divisions of the Association, the lists of 
Diplomates in the fields of clinical, industrial, and counseling of the 
American Board of Examiners in Professional Psychology, the 
By-Laws, and a geographical and institutional index of members 
are included. The editor is Helen M. Wolfle of the Association 
staff. 250 pages, $2.00. 


SAMPLE ENTRIES 


Humphreys, Lloyd G. School of 
Education, Stanford Univ, Stanford, 
Calif. Assoc. prof. educ. and 
psych. PhD 38. A 5; F 9, 19. 

Hunsicker, Mr. Albert L. Committee 
on Human Development, 
Univ. of Chicago, Chicago 37, Ill. 
Stud. MA 39. A. 

Hunt, Dr. Howard F. Dept. Psych, 
Univ. of Chicago 37, Ill. Assoc. 
prof. PhD 43. F 19. 

Hunt, Dr. J. McV. Institute of 
Welfare Research, Community 
Service Society, 105 East 22nd St, 
New York 10, N. Y. Dir. PhD 
33. Dip-Cl. F 3, 8, 9, 12. 

Hunt, Mary Louise 1252 Talbert 
St. S. E., Washington, D. C. A 48. 

Hunt, Dr. Thelma Dept. Psych, 
George Washington Univ, Washington 
6, D. C. F 5, 12. 

Hunt, Prof. William A. Dept. 
Psych, Northwestern Univ, Evanston, 
Ill. Prof. psych. and biol. 
PhD 31. Dip-Cl. F 2, 3, 8, 9, 12, 
19, 20. 

Hunt, Mr. Wilson L. Boston 
State Hosp, 591 Morton St, Dorchester 
Center 24, Mass. Clin. 
psych’t. AM 47. A 49. 

Hunter, Dr. Elwood C. Dept. Education, 
Tulan 
15, La. 


Journal of Applied Psychology 


Vol. 34, No. 3 


JUNE, 1950 


The Administrative Judgment Test * 


Milton M. Mandell 


United States Civil Service Commission, Washington, D. C. 


The development of valid methods for selecting 
persons to carry the primary responsibility 
for directing our large corporations and government 
agencies is still a major task in the field 
of personnel administration. 

Justification for extensive study of 
this question is found in the fact that analysis 
of unsuccessful organizations so often indicates 
that the lack of success is due to lack of 
administrative ability. 
P asd inducement may be found in the 
ius eA of the late Professor John G. Jenkins 
take psychologists should be willing to under- 
m UM in areas of great significance even 
»*. 1 the experimental conditions are not so 
Se as would be desirable. 
Tei ed Psychologists have recognized the 
mes importance of the field of admin- 
exkl] ^ Selection. The broad-gauge work of 
No Shartle and his group at Ohio State 
ion "d in the general field of administra- 
Pro dies some attention to selection, should 
Valuabi. undamental facts that will be of in- 
of exec e aid to those interested in the selection 
bs a and administrators. 
remains t venture the opinion that while much 
ready so o be known in this field, there are al- 
as GE established facts which can be used 
reasonable hypotheses. For example, it is 
mmon d hypothesize that there are both 
ni e ements and special elements in all 
Strative jobs. All positions which can 


properly be designated as executive or administrative 
have certain common elements; however, 
there are also such special elements as 
the amount of verbal ability required, the 
amount of persuasive skill required, the tempo 
of operations required, and other factors. Any 
good selection program for these positions will 
take into account these common and special 
factors. 

* Other reports on the experimental work of the 
United States Civil Service Commission in the administrative 
field are available in the following articles: 
Mandell, M. M., The selection and promotion of a 
personnel staff, Personnel, 1948, 25, 125-127; Mandell, 
M. M., and Adkins, Dorothy C., The validity of 
written tests for the selection of administrative personnel, 
Educ. Psychol. Measmt., 1946, 6, 293-313; Mandell, 
M. M., Testing for administrative and supervisory 
positions, Personnel Rev., 1948, 9, 190-193. 

It will be noted that there is no attempt here 
to define precisely administrative positions or 
executive positions. Job analyses indicate 
that program planning and coordination are 
essential characteristics of administrative and 
executive positions. Therefore, the following 
working definition is offered: An executive job 
or administrative job is one in which more than 
50 per cent of the time is devoted to program 
planning and coordination. This does not, 
obviously, include all the elements of the 
administrative job but it does include two 
elements which will probably always be found. 

The purpose of this brief report is to describe 
the administrative-judgment test which has 
been developed by the United States Civil 
Service Commission as part of its program of 
research in the field of administrative selection. 
The reason for describing this one test is that 
it is the test which has given the most con- 
sistently satisfactory results and the one on 
which the most experimentation has been done. 
While the data below are based on only 171 
cases, they represent four different samples and 
two different types of criteria. The relative 
consistency of the results among these different 
samples and the use of different criteria offer a 
basis for the belief that the test is measuring 
elements which are essential to administrative 
success. 

The Test. The administrative-judgment test 
is in 5-choice form. In the housing studies 
referred to below, 100 items were used; in the 
two other studies, 80 items were used. The 
test attempts to measure broad understanding 
of the processes of administration. It attempts 
to measure the understanding of the adminis- 
trative problems of large organizations, whether 
government or private. The questions at- 
tempt to measure the common elements in the 
administrative process. They include prob- 
lems in the relationships between the head- 
quarters and field offices in an organization, and 
those between research and operating person- 
nel. They also include problems on the timing 
of programs and the organization of the office 
of an administrator. The test does not at- 
tempt to measure technical knowledge in such 
fields as personnel or budgeting or accounting. 
It attempts, as far as possible, to divorce its 
contents from complete dependence on basic 
training, and attempts to emphasize problems 
which can be evaluated on the basis of observa- 
tion and experience or training. The split- 
half reliability of the administrative-judgment 
test is .94 based on the group of 258 cases on 
which scores were available. Below is a sample 
item from this test: 


Which one of the following administrative 
situations or problems will most probably 
occur when direct relations are permitted 
between a staff specialist employed by the 
national office of an organization and the 
operating officials employed in the field 
offices? 

A) decrease in the feeling of responsibility 
   of national office specialists for 
   the operations of state programs in 
   their specialties 
B) inadequate technical supervision of 
   field office operations 
C) inadequate knowledge in the national 
   office of the competence and qualifications 
   of field office personnel 
D) difficulty in keeping the relations on 
   an advisory basis 
E) subordination of professional considerations 
   to general administrative 
   responsibilities 


Criteria Used. Two basic types of criteria 
have been used in the experiments with this 
test. The first type, job performance, represents 
the collective ratings of colleagues and 
superiors. In all these cases except one, the 
job performance ratings are a composite of 
graphic ratings and paired comparisons; in 
rating one group of 20 line administrators, only 
graphic ratings were used. In addition to 
job-performance ratings, the position grade or 
salary of the subject was also used as a criterion. 
An average of more than four independent 
ratings for each subject in the study 
was obtained. Because of the small samples 
involved, practically no cases have been eliminated 
from the samples because of disagreement 
among raters. No cases were eliminated 
from the housing and Veterans Administration 
studies for this reason. In the case of the 
Navy study, approximately half of the cases 
were eliminated because of substantial disagreement 
among the raters. This greater 
divergence in the Navy Department as compared 
with the other groups included in the 
study can possibly be explained on the basis 
of the greater size of the Navy Department 
and the greater number of bureaus that the 
subjects worked in. The fact that 6 of the 
correlations represent no elimination of cases 
lends further weight to the data that are 
presented. In addition, the scoring key used 
was determined in advance; in other words, the 
key that was used is not based upon item analysis 
on a particular group. In all cases Pearson 
product-moment correlations were used. 

Population. The first group studied were 
63 persons in personnel, budget, and organization 
analysis work in two major housing 
agencies in the Federal government. They 
are receiving salaries of between $3,... and 
$9,400. The group at the Veterans Administration 
consisted of 42 persons in the personnel 
office of that agency receiving salaries between 
$3,000 and $6,000. These persons were engaged 
in all types of personnel work. The 
Navy group represents persons in several 
bureaus of the Navy Department engaged in 
various phases of personnel, office methods, and 
organization analysis work with salaries between 
$3,000 and $8,400. The last group, 
from the housing agencies, consisted of 20 
persons, with salaries of between $7,... and 
$10,000, who are responsible for administering 
major segments of the Government's housing 
program. 

In addition to the data obtained for the 
administrative-judgment test, validity coefficients 
are presented for tests in the 
field of mental ability. For the housing groups, 
the test involved was the American Council 
on Education Psychological Examination for 
College Freshmen. The total score on the test 
was used as the predictor. For the other 
groups the test consisted of 25 vocabulary 
items in multiple-choice form. The reliability 
of the vocabulary test is .84 for a sample of 
... cases with a standard deviation of 5.6, 
using Kuder-Richardson Formula No. 21. 
The purpose in presenting these data for mental 
ability is to demonstrate that the relative 
validities for this test are substantially 
lower, in general, than for the administrative-judgment 
test, despite the fact that the intercorrelation 
between the administrative-judgment 
test and the American Council on Education 
test is +.69 while the intercorrelation between 
the administrative-judgment test and 
the vocabulary test is +.59. A hasty inspection 
of these intercorrelations would lead to the 
conclusion that the mental-ability and 
administrative-judgment tests are measuring the same 
factors; actually, the validity coefficients belie 
this conclusion. The data would indicate that 
the only proper conclusion is that it is not 
necessary to use a mental-ability test along 
with the administrative-judgment test because 
the multiple correlation of these two tests 
would not be sufficiently greater to justify the 
addition of the mental-ability test. 


Table 1 

Product-Moment Correlations for the Administrative-Judgment 
and Mental-Ability Tests with Job 
Performance and Grade Criteria 

                                      Test of          Test of 
                                      Administrative   Mental 
Group        N    Criterion           Judgment         Ability 

1. Housing   63   Job Performance     +.49             +.30 
             63   Grade               +.56             +.38 
2. VA        42   Job Performance     +.50             +.52 
             42   Grade               +.52             +.26 
3. Navy      22   Job Performance     +.51             +.13 
             46   Grade               +.28             +.21 
4. Housing   20   Job Performance     +.68             +.64 
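
The claim about the multiple correlation can be checked roughly with the standard two-predictor formula. The sketch below is an illustration only, not the author's computation: it takes the first housing row of Table 1 (+.49 and +.30 against job performance) and assumes that the reported intercorrelation of +.69 between the two tests applies to that same sample.

```python
import math


def multiple_r(r1: float, r2: float, r12: float) -> float:
    """Multiple correlation of one criterion with two predictors.

    r1, r2 are the validity coefficients of the two tests;
    r12 is the correlation between the tests themselves.
    """
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


# Values from the first housing row of Table 1; pairing them with the
# +.69 intercorrelation is an assumption made for this illustration.
judgment_r = 0.49
ability_r = 0.30
intercorrelation = 0.69

combined = multiple_r(judgment_r, ability_r, intercorrelation)
gain = combined - judgment_r
print(f"multiple R = {combined:.3f}, gain over judgment test alone = {gain:.3f}")
```

Under these assumptions the two tests together predict the criterion only marginally better than the administrative-judgment test alone, which is the point made in the text.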

Additional Test Group. The latest trial of 
the administrative judgment test involved its 
administration to 30 persons being trained for 
line and staff administrative positions in the 
State Department. These persons receive 
salaries of approximately $3,000 a year. The 
criterion was the collective opinion of the 
supervisors for whom they have worked dur- 
ing their period of internship. The Pearson 
product-moment correlation for the adminis- 
trative judgment test with this criterion was 
+.60. The validity coefficient, using the same 
criterion, for the vocabulary test referred to 
above was +.23. 

Summary 


The median validity coefficient for the ad- 
ministrative-judgment test is +.51; the median 
validity coefficient for the mental-ability test 
is +.30. Six of the seven coefficients for the 
administrative-judgment test are significant 
at the 1 per cent or 5 per cent level of confidence 
while the significance levels of the coefficients 
for the mental-ability tests are in general much 
lower. 
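
The summary figures can be reproduced from Table 1. The following sketch is illustrative: the medians come directly from the tabled coefficients, while the significance check assumes a two-tailed t test of each r and uses a single conservative cutoff (2.101, the 5 per cent value for 18 degrees of freedom, the smallest sample here). The report does not state which test the author applied.

```python
import math
from statistics import median

# (N, r for administrative judgment, r for mental ability), from Table 1.
rows = [
    (63, 0.49, 0.30),
    (63, 0.56, 0.38),
    (42, 0.50, 0.52),
    (42, 0.52, 0.26),
    (22, 0.51, 0.13),
    (46, 0.28, 0.21),
    (20, 0.68, 0.64),
]


def t_for_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


median_judgment = median(j for _, j, _ in rows)
median_ability = median(a for _, _, a in rows)

# Assumed convention: two-tailed 5 per cent level, with one conservative
# cutoff applied to every row regardless of its degrees of freedom.
CUTOFF = 2.101
significant_judgment = sum(t_for_r(j, n) > CUTOFF for n, j, _ in rows)

print(median_judgment, median_ability, significant_judgment)
```

With this cutoff only the Navy grade coefficient (+.28, N = 46) falls short, consistent with the statement that six of the seven coefficients are significant.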

These data are offered as a basis for further 
experimentation in other situations in order 
to determine the value of the administrative- 
judgment type of test for executive positions. 


Received September 9, 1940, 


Menstruation and Industrial Efficiency. II. Quality and 
Quantity of Production 


Anthony J. Smith 


University of Kansas 


The present paper continues the report of an 
investigation of the relationships between the 
various phases of the menstrual cycle and 
certain measures of industrial efficiency. In 
all, four criteria of industrial efficiency were 
studied. In a previous paper (5) the relations 
between menstrual function and absence rate 
and activity level were examined. 

In view of the common contention that 
accuracy of performance deteriorates during 
the premenstrual and menstrual phases, it 
would seem to be desirable to investigate this 
hypothesis in an industrial situation. To the 
knowledge of the author, such a study had not 
been undertaken previously. 

Furthermore, it is frequently assumed that 
quantity of production decreases during certain 
phases of the menstrual cycle. Several reports 
of experimental studies are available on this 
point, but the results are not consistent. 
Kirihara (3) and Meeker (7) report detrimental 
effects related to the menstrual phase, whereas 
Nowikowa (2), Gorkin and Brandis (2), and 
Anderson (1) report no differential effects. 


The area is obviously in need of further 
investigation. 


Procedure and Analytic Techniques 


Aircraft Factory. Twenty-nine women employed 
in the electrical department of an aircraft 
factory were studied over a period of 
forty-one days. All of their work underwent 
a routine inspection and complete records were 
kept including the employee's clock number, 
the unit number, the date, and the defects 
discovered. In order to obtain data on menstrual 
function, these women were assembled 


1 The women who served as subjects in this and the 
following factories were, with the exceptions noted, the 
same women described in the first paper (5). 


women agreed to participate and later submitted 
daily menstrual data to a member of 
the physical education department. 

In analyzing the data, the menstrual cycle 
was broken down into a five day premenstrual 
phase, the menstrual phase (period of flow), a 
seven day postmenstrual phase, and an intermenstrual 
phase. 

The effect of the menstrual cycle upon 
quality of production was determined by computing 
the number of error days (during which 
at least one piece was rejected) and the number 
of errorless days for all women for each of the 
four phases of the cycle. The data were then 
recorded in a two-by-four table and the significance 
of the variations tested by means of the 
chi-square test. The simpler menstrual-nonmenstrual 
analysis was also performed. 
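
The two-by-four analysis described above can be sketched as follows. The counts are hypothetical, chosen only to show the layout (rows: error days and errorless days; columns: the four phases); they are not the study's data, and the function is a plain Pearson chi-square without any correction.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts for illustration only.
# Columns: premenstrual, menstrual, postmenstrual, intermenstrual.
counts = [
    [12, 10, 15, 30],    # error days
    [60, 50, 85, 170],   # errorless days
]

statistic, df = chi_square(counts)
CRITICAL_5_PERCENT_DF3 = 7.815  # tabled chi-square value for 3 degrees of freedom
print(f"chi-square = {statistic:.3f}, df = {df}, "
      f"significant at 5 per cent: {statistic > CRITICAL_5_PERCENT_DF3}")
```

The menstrual-nonmenstrual comparison mentioned in the text would use the same function on a two-by-two table.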

An attempt was made to derive a measure 
of quality of production that was more sensitive 
than that involving the mere presence or 
absence of defective units. However, after 
several conferences with the supervisory staff 
it was apparent that there could be no agreement 
that would permit one to scale on a 
quality continuum inspection records differing 
with respect to kinds and numbers of errors. 
Hence, the cruder measure was retained. 

Parachute Factory. The final criterion 
investigated was quantity of production. A 
group of employees in a parachute factory was 
contacted by the company nurse and each 
woman was given an explanation of the experiment 
similar to that previously described. 
Forty-six women offered to participate and 
supplied the nurse with the necessary data on 
menstrual function. The women in this company 
were paid on a piece rate system with the 
result that the personnel department maintained 
accurate individual records including 
daily production, kind of work, and number 
of hours worked. 

For various reasons production records were 
not available for five of these subjects. Consequently, 
the analysis was based on data from 
the remaining forty-one women. 

Examination of individual records disclosed 
that the number of hours worked per day 
varied from person to person, from shift to 
shift, and also for the same individual. This 
made it necessary to determine for each day 
the average hourly rate of production for each 
individual performing her usual work. 

It became evident that differences in efficiency 
among the various women and differences 
in the types of units produced on the 
diverse jobs would make the analysis of these 
raw scores (hourly production rate) undesirable, 
because variations at the many ability 
levels and among jobs would then have unequal 
effects. To meet this difficulty comparable 
derived scores had to be computed. 

At this point in the analysis, each individual 
was treated separately. The individual's average 
hourly rates through the total interval investigated 
were examined. A measure of 
variability was obtained (standard deviation) 
and each single hourly rate was replaced by a 
standard score. Once these standard scores 
had been derived, individual consideration was 
discontinued. 

The first analysis involved those forty-one 
women for whom production records were 
available. A distribution of all standard 
scores falling within the first intermenstrual 
period was made and the mean and standard 
deviation were computed. Similar distributions 
were made for the subsequent phases 
(premenstrual, menstrual, postmenstrual, 
intermenstrual, etc.). These means were then 
graphed with average standard scores for each 
phase being plotted along the ordinate while 
time measurements were recorded along the 
abscissa. (As an example, see Figure 1.) 
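
The derivation of comparable scores described above can be illustrated with a minimal sketch. All numbers are hypothetical, and the use of the population standard deviation (dividing by N) is an assumption, since the report does not say which form was used. Daily output is first converted to an hourly rate, and each worker's rates are then expressed as standard scores about her own mean.

```python
from statistics import mean, pstdev


def hourly_rates(units_per_day, hours_per_day):
    """Average hourly production for each day worked."""
    return [units / hours for units, hours in zip(units_per_day, hours_per_day)]


def standard_scores(rates):
    """Express one worker's hourly rates as deviations from her own mean,
    in units of her own standard deviation (population form, an assumption)."""
    centre = mean(rates)
    spread = pstdev(rates)
    return [(rate - centre) / spread for rate in rates]


# Hypothetical records for one worker over three days.
units = [80, 96, 98]
hours = [8, 8, 7]

rates = hourly_rates(units, hours)   # 10, 12 and 14 units per hour
scores = standard_scores(rates)
print([round(score, 3) for score in scores])
```

Once every worker's rates are on this common scale, the scores falling within a given phase can be pooled across workers and averaged, as the text describes. A worker whose rate never varied would have a zero standard deviation and would need separate handling.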

The matter of plotting the various elements 
of the cycles along the time axis posed a further 
problem. These phases could not be plot- 
ted at equal intervals along the axis for 
they did not represent equal periods of time. 

Consequently, each phase was represented 
by a distance that was in the same proportion 
to the total distance along the axis that the 
number of days in the given phase was to the 
total number of days studied. When the mean 
production in standard units was plotted for 
each phase of the cycle, the points were entered 
at the midpoints of their corresponding inter- 
vals along the abscissa. 
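
The spacing rule just described (widths proportional to days, points at interval midpoints) can be sketched as below. The phase lengths are hypothetical apart from the five day premenstrual and seven day postmenstrual phases named earlier; the function name and the axis length are likewise illustrative.

```python
def phase_midpoints(days_per_phase, axis_length=1.0):
    """Abscissa position for each phase: widths are proportional to the
    number of days in the phase, and each point sits at the interval midpoint."""
    total_days = sum(days_per_phase)
    midpoints = []
    start = 0.0
    for days in days_per_phase:
        width = axis_length * days / total_days
        midpoints.append(start + width / 2)
        start += width
    return midpoints


# Hypothetical cycle: premenstrual 5 days, menstrual 4, postmenstrual 7,
# intermenstrual 12 (the 4 and 12 are assumed for illustration).
phase_days = [5, 4, 7, 12]
positions = phase_midpoints(phase_days, axis_length=28.0)
print(positions)
```

Setting the axis length equal to the total number of days makes each position read directly in days from the start of the record.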

Fig. 1. Mean rates of production during successive menstrual phases, 
showing the obtained curve and the fitted curve. Abscissa: phases of 
successive menstrual cycles. Parachute factory: Ages 29-38. 

It will also be seen that there is an increase 
in mean production in succeeding time periods. 
This would indicate that a learning factor was 
operating, as might be anticipated. Improvement, 
whether it be small or great, would be 
reflected by an analysis involving standard 
scores. Furthermore, even assuming that 
some of the subjects have reached a plateau 


Menstruation and Industrial Efficiency. II. Quality and 
Quantity of Production 


Anthony J. Smith 


University of Kansas 


'The present paper continues the report of an 
investigation of the relationships between the 
various phases of the menstrual cycle and 
certain measures of industrial efficiency. In 
all, four criteria of industrial efficiency were 
studied. In a previous paper (5) the relations 
between menstrual function and absence rate 
and activity level were examined. 

In view of the common contention that 
accuracy of performance deteriorates during 
the premenstrual and menstrual phases, it 
would seem to be desirable to investigate this 
hypothesis in an industrial situation. To the 
knowledge of the author, such a study had not 
been undertaken previously. 

Furthermore, it is frequently assumed that 
quantity of production decreases during certain 
phases of the menstrualcycle. Several reports 
of experimental studies are available on this 
point, but the results are not consistent. 
Kirihara (3) and Meeker (7) report detrimental 
effects related to the menstrual phase, whereas 
Nowikowa (2), Gorkin and Brandis (2), and 
Anderson (1) report no differential effects. 


The area is obviously in need of further 
investigation. 


Procedure and Analytic Techniques 


Aircraft Factory. Twenty-nine women em- 
ployed in the electrical department of an air- 
craft factory were studied over a period of 
forty-one days. All of their work underwent 
a routine inspection and complete records were 
kept including the employee’s clock number. 
the unit number, the date, and the defects 
discovered. In order to obtain data on men- 


strual function, these Women were assembled 


! The women who se í 

i % tved as subjects in thi 
following factories were, with the tee ia and the 
Same women described in the first paper (5). noted, the 


148 


women agreed to participate and later sub- 
mitted daily menstrual data to a member 0 
the physical education department. 1 gdé 
In analyzing the data, the menstrua E 
was broken down into a five day premenstr! 4 
phase, the menstrual phase (period of flow) ni 
seven day postmenstrual phase, and an m 
menstrual phase. on 
The effect of the menstrual cycle p. 
quality of production was determined by ici 
puting the number of error days (during V 
at least one piece was rejected) and the wi 
of errorless days for all women for each » (hen 
four phases of the cycle. The data were ifi 
recorded in a two-by-four table and the S£ he 
cance of the variations tested by means in 
chi-square test. The simpler menstrua 
menstrual analysis was also performed. easure 
An attempt was made to derive à sensit 
of quality of production that was more € of 
than that involving the mere preset afte! 
absence of defective units. Howeve™ ff 
several conferences with the supervisory gre 
it was apparent that there could be ^ K 
ment that would permit one to seine 
quality continuum inspection records rro” 
with respect to kinds and numbers © 
Hence, the cruder measure was re oV 
Parachute Factory. The final porti A 
investigated was quantity of produc ry “i 
group of employees in a parachute pe d v. 
contacted by the company nurse i the E 
woman was given an “explanation ive 
periment similar to that previously sg iM 
Forty-six women offered to partioP ta 0 
supplied the nurse with the necess4™ o 
menstrual function. The women a^ Y 
pany were paid on a piece rate syst t jn 
result that the personnel depart" includ o 
tained accurate individual records qo” 
daily production, kind of work, a? 
of hours worked. . 
For various reasons production 7 cts- 
not available for five of these subJ 


esc? 


record 


Menstruation and Industrial Efficiency. II 


Sequently, the analysis was based on data from 
the remaining forty-one women. 

Examination of individual records disclosed 
that the number of hours worked per day 
varied from person to person, from shift to 
shift, and also for the same individual. This 
made it necessary to determine for each day 
the average hourly rate of production for each 
individual performing her usual work. 

_It became evident that differences in eff- 
clency among the various women and differ- 
ences in the types of units produced on the 
diverse jobs would make the analysis of these 
raw scores (hourly production rate) undesir- 
able, because variations at the many ability 
levels and among jobs would then: have un- 
equal effects. To meet this difficulty com- 
Parable derived scores had to be computed. 

At this point in the analysis, cach individual 
Was treated separately. The individual’s aver- 
age hourly rates through the total interval in- 
vestigated were examined. A measure of 
Variability was obtained (standard deviation) 
and each single hourly rate was replaced by a 
pandard Score. Once these standard scores 

ad been derived, individual consideration was 

IScontinued, 
au he first analysis involved those forty-one 
P yi for whom production records were 
ae able. A distribution of all standard 

ores falling within the first intermenstrual 


149 


period was made and the mean and standard 
deviation were computed. Similar distribu- 
tions were made for the subsequent phases 
(premenstrual, menstrual, postmenstrual, in- 
termenstrual, etc.). These means were then 
graphed with average standard scores for each 
phase being plotted along the ordinate while 
time measurements were recorded along the 
abscissa. (As an example, see Figure 1.) 

The matter of plotting the various elements 
of the cycles along the time axis posed a further 
problem. These phases could not be plot- 
ted at equal intervals along the axis for 
they did not represent equal periods of time. 

Consequently, each phase was represented 
by a distance that was in the same proportion 
to the total distance along the axis that the 
number of days in the given phase was to the 
total number of days studied. When the mean 
production in standard units was plotted for 
each phase of the cycle, the points were entered 
at the midpoints of their corresponding inter- 
vals along the abscissa. 
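
The proportional spacing described above can be sketched in a few lines. The phase lengths and the axis length below are hypothetical and serve only to show the arithmetic; the function name is likewise an illustration.

```python
def phase_midpoints(phase_days, axis_length=1.0):
    """Return the abscissa position of each phase's plotted point.

    Each phase receives a stretch of the axis proportional to its share
    of the total days, and the point is entered at the middle of it.
    """
    total_days = sum(phase_days)
    midpoints = []
    start = 0.0
    for days in phase_days:
        width = axis_length * days / total_days
        midpoints.append(start + width / 2)
        start += width
    return midpoints


# Hypothetical phase lengths in days, plotted on an axis 24 units long.
example_days = [10, 4, 6, 4]
mids = phase_midpoints(example_days, axis_length=24.0)
```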

It will also be seen that there is an increase 
in mean production in succeeding time periods. 
This would indicate that a learning factor was 
operating, as might be anticipated. Improve- 
ment, whether it be small or great, would be 
reflected by an analysis involving standard 
scores. Furthermore, even assuming that 
some of the subjects have reached a plateau 


Fig. 1. Mean rates of production during successive menstrual phases. Parachute factory: Ages 29-38. (Obtained curve and fitted curve; abscissa: phases of successive menstrual cycles.)


while the remainder are improving to some
extent, the composite curve would exhibit this
improvement. This learning factor had to be eliminated
before other variations in performance could 
be studied. Ideally, this would be effected by 
fitting a theoretical learning curve to the data 
and, thereafter, considering only variations 
about the curve. However, in this case no 
theoretical learning curve was available and it 
was necessary to fit a smooth curve to the data 
that would make the deviations about it a 
minimum. The curves throughout this study 
were fitted by inspection. 

From this point on, the fitted curve was 
treated as being equivalent to a zero line and 
the positive and negative deviations of the 
points about it were computed. These values 
were the new mean production measures in- 
dependent of learning, expressed in terms of 
standard scores. 

This was followed by a combination from 
similar periods of these new scores. A com- 
posite intermenstrual mean was derived by 
combining the means from the three intermen- 
strual periods studied. Similar composite 
means were obtained for the other components 
of the menstrual cycle. Following this, com- 
posite variances for the four phases were com- 
puted and were then tested for homogeneity 
(4). With these four composite means and 
four composite variances available, it was 
possible to make analyses of the differential 
effects of the various parts of the menstrual 
cycle by means of an analysis of variance 
technique for each situation in which the condi- 
tion of homogeneity of variances was realized. 
In each instance, the statistic used was 
Snedecor's F (6). 
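
The F ratio used here is the mean square among phases divided by the mean square within phases. The sketch below computes it from raw grouped scores, which is an assumption made for illustration: the study worked from composite means and variances, and the data shown are hypothetical.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, (df_between, df_within)) for a one-way analysis of variance."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)
    n = len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, (k - 1, n - k)


# Hypothetical learning-corrected standard scores for the four phases.
phases = {
    "premenstrual": [-1.0, 0.0, 1.0],
    "menstrual": [0.0, 1.0, 2.0],
    "postmenstrual": [1.0, 2.0, 3.0],
    "intermenstrual": [0.0, 1.0, 2.0],
}
f_ratio, dfs = one_way_f(list(phases.values()))
```

The resulting F would then be referred to a table of F for the stated degrees of freedom, as in Snedecor.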

This general approach was utilized in each of 
the thirteen remaining analyses of variations in 
production. The thirteen conditions ana- 
lyzed include those to be found in Table 2, as 
well as the first shift and average mental 
difficulty groups. 

Garment Factory. Quantity of production 
was also investigated at a local garment factory. 
Data on rate of production were collected by 
the business manager of the union, who re- 
quested that the employees provide her with 
records of their earnings, presumably for use 
by the union in subsequent discussions with 


Anthony J. Smith 


management. This approach could conceivably
have had some effect on earning rate
through the factor of suggestion. However,
there was no reason to expect a differential
menstrual effect. Special forms were prepared
covering weekly periods but requesting daily
information on number of hours worked, dozens
of units completed, and ticket (daily) earnings.
At approximately the same time, permission
was granted by the union to approach individual
members and solicit their participation
in a research project. Immediately thereafter,
a small group of sixteen women was
selected to act as potential subjects. These
women were contacted by the author's [illegible]
during their lunch hour and informed of
the apparent "purpose" of the experiment.
They then agreed to provide the information
requested.

It was decided at a conference involving the
personnel manager, the union business agent,
and the author that daily earnings were the
best available measures of efficiency. The
number of units completed was unsatisfactory
because the women worked at different jobs
requiring varying amounts of time for different
units. Earning rates for each of the types of
work had been determined by joint labor-management
study and consequently reflected
ability more adequately. These employees
worked on a piece rate basis. Their work
was inspected and they were required to rework
all rejected units. As a result,
earnings were measures of quality and quantity
of production.
The analysis of the variations in earning rate
was carried out in a manner similar to that
employed at the parachute factory. Average
hourly earning rates were computed and these
were transformed to standard scores in
terms of the subject's own performance.
Distributions of the standard scores of all
subjects during each of the phases of the menstrual
cycle were obtained. The means
for each phase were then determined and
plotted. A curve was fitted to these to
eliminate the effects of learning. Deviations
of the points from this curve were computed
and were treated as the new means.
Means for similar phases were combined, as
were the variances, and the variances were
tested for homogeneity. Finally, an analysis


Menstruation and Industrial Efficiency. II

Table 1
Quality of Production: Frequency of Occurrence of Error and Non-Error Days During the
Phases of the Menstrual Cycle

                 Premenstrual   Menstrual   Postmenstrual   Intermenstrual
Error Day        18             17          19              48
                 (18.78)*       (20.16)     (24.41)         (38.69)
Non-error Day    132            144         176             261
                 (131.22)       (140.84)    (170.59)        (270.31)

* Numbers in parentheses refer to theoretical frequencies.


of variance was performed on the derived
composite means and composite variances.


Results and Discussion 


Quality of Production. The analysis of the
four-phase data obtained from inspection records
(see Table 1) revealed differences that
were not significant (P equals .20). The gross
analysis contrasting menstrual with nonmenstrual
days yielded much higher agreement between
theoretical and obtained frequencies (P
equals .60). The trend was in the direction of
fewer error days during the menstrual period.

It was felt that some of the women might
manifest an obvious detrimental effect during
the menstrual phase and that such an effect
might be obscured by group treatment. Although
individual analyses could not be undertaken
satisfactorily, an examination of individual
records disclosed no tendency toward a
greater proportion of error days in the menstrual
period than during the nonmenstrual interval.
Those persons who performed poorly
during the period of flow tended to perform
poorly at all times.

The failure to discover a significant decrease
in the quality of production during the menstrual
phase might conceivably be explained on
the grounds that accuracy was maintained at
the expense of lowered production. A consideration
of the results derived from the
parachute and garment factories would seem
to render this interpretation untenable.
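
The theoretical frequencies in Table 1 can be recomputed from the observed counts. The sketch below assumes that the comparison was a chi-square test of independence, which this section does not name; the function names are illustrative.

```python
# Observed frequencies from Table 1 (columns: premenstrual, menstrual,
# postmenstrual, intermenstrual).
observed = [
    [18, 17, 19, 48],      # error days
    [132, 144, 176, 261],  # non-error days
]


def expected_frequencies(table):
    """Theoretical cell frequencies: row total times column total over N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_frequencies(observed)
chi2 = chi_square(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
```

With three degrees of freedom the statistic falls well short of 7.815, the value required at the .05 level, which agrees with the report that the differences were not significant.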


Rate of Production. In checking the
homogeneity of the variances of production
rates during the several phases of the menstrual
cycle, significant differences were encountered
in only two instances. On the first shift, while
the differences were significant, variability
was highest during the postmenstrual period 
and lowest during the premenstrual period. 
Women working on jobs described as being of 
average mental difficulty also revealed signifi- 
cant differences in variability, with variability 
being greatest during the intermenstrual phase 
and lowest during the premenstrual phase. 
This decreased variability was not accompanied 
by a decrease in production. Obviously, no 
detrimental menstrual effect is indicated. 

Table 2

Rate of Production: Values of F and Corresponding
Probability Values

                          Number of Days
                          Analyzed          F        Probability
Parachute Factory
  All Subjects            1711              1.67     .19
  Shift
    Second                553               1.88     .15
    Third                 567               1.09*    >.20*
  Age
    18-28                 438               2.76*    >.20*
    29-38                 576               1.82*    >.20*
    39-50                 697               1.18     >.20
  Mental Difficulty
    Simple                252               2.24     .10
    Difficult             7                 3.55     .02
  Physical Difficulty
    Average               1431              2.03*    >.20*
    Strenuous             280               3.17     .03
    Standing              707               1.16     >.20
    Sitting               1004              1.18     >.20
Garment Factory
  All Subjects            457               4.66*    .15

* Variance among groups is smaller than variance
within groups.

In the analysis of the production rates of the
remaining testable groups, two yielded significant
differences (see Table 2). Women engaged
in jobs involving a relatively high level
of mental difficulty manifested significant dif- 
ferences in production rate. However, their 
lowest production occurred during the premen- 
strual phase, with highest production appearing 
in the menstrual phase and being of such a level 
as to counteract the premenstrual loss. The 
second group of women working on jobs de- 
manding strenuous physical activity displayed 
high postmenstrual production and low inter- 
menstrual production. 

In brief, variability in production on the 
jobs studied in this investigation does not 
increase during the premenstrual or menstrual 
phases as has been claimed. If anything, it 
would appear to decrease. Furthermore, rate 
of production does not seem to decrease during 
the period of flow or during a period of possible 
“premenstrual tension,” with the exception of 
the women performing work of a high level 
of mental difficulty. As a matter of fact, 
there is evidence that a drop in production 
occurs in the intermenstruum in one instance. 
It may be that both of these “significant” dif- 
ferences could have occurred by chance. How- 
ever, in the groups not yielding significant 
differences the intermenstrual period is often 
characterized by low production, whereas the 
postmenstrual period is usually one of high 
production. 


Summary 


This second part of the investigation was 
undertaken to determine the effects of the 
various phases of the menstrual cycle upon 
industrial efficiency as reflected in quality and 
rate of production. 


A total of eighty-six women in the aircraft
and garment industries served as subjects in
the three component studies. Approximately
thirty-eight hundred individual working days
were ultimately analyzed in the complete
study.

In each possible instance, two analyses were
performed. In the gross analysis, menstrual
performance was contrasted with nonmenstrual
performance. The more intensive analysis was
concerned with a comparison of performances
during the premenstrual, menstrual, postmenstrual
and intermenstrual phases.



In the study of the employees of the parachute
factory, the subjects besides being examined
as a group were classified into
groups according to shift, mental difficulty of
work, physical difficulty of work, age, and
standing vs. sitting jobs to determine possible
differential effects of the cycle.

The study of quality of production, measured
in terms of the presence or absence of error
days (days on which defective units were
worked), shows that variations are generally
small and unrelated to the component phases of
the menstrual cycle.

Significant differences in variability of production
rate occur in two of the subgroups. In
one case, greatest variability is found in the
postmenstrual period, while in the other, it
occurs in the intermenstrual period.

All of the analyses of production rate but
two reveal no statistically significant differences
among the phases of the menstrual cycle.
In one group intermenstrual production is low,
while the second group displays low premenstrual
production which is offset by high
production in another phase. When differences
occur they would appear to be the result of
situational determinants rather than menstrual
function.


Received September 20, 1940.


References 


1. Anderson, Mary. Some health aspects of putting
women to work in war industries. Annual meeting,
Industrial Hygiene Foundation of America, Inc.,
Nov. 10-11, 1942, 16.

2. Gorkin, S., and Brandis, S. Einfluss der Menstruation
auf einige psychophysiologische Funktionen
und auf Arbeitsfähigkeit der Frau. Arbeitsphysiologie,
1936, 9, No. 3.

3. Kirihara, H. Functional periodicity. Sci. of
Labour, Kurasiki, Japan, 1955, 2.

4. Rider, P. R. An introduction to modern statistical
methods. New York: John Wiley, 1939.

5. Smith, A. J. Menstruation and industrial efficiency.
I. Absenteeism and activity level. J. appl.
Psychol., 1950, 34, 1-5.

6. Snedecor, G. W. Statistical methods. Ames, Iowa:
Collegiate Press, Inc., 1938.

7. Women's Work (Anon.). Occupation and Health, No. 15.
Int. Lab. Off., Geneva, 1930-1934.


Cross-Validation of Clerical Aptitude Tests 


Edward N. Hay 
Aptitude Test Service, Swarthmore, Pa.


Tests have been used for fifteen years in the 
selection of clerical workers at the Pennsylvania 
Company for Banking and Trusts, Philadelphia.
In 1941 a study was made of the prediction
of success in machine bookkeeping,¹
using speed of posting as the criterion. A
battery of three tests, Number Series Completion,
Name Finding, and Minnesota Numbers,
was found to give excellent results in
predicting success in bookkeeping.

The present study was undertaken to determine
whether it was possible to predict success
of key-punch operators with the same tests.
Since there appeared to be no objective way
of ranking key-punch operators according to
production because of differences in their jobs,
their performance was rated by their supervisor.
The ratings were under the general
heading of “performance,” and were divided into
six levels, with appropriate headings, as follows:
Group I. Quantity and quality of production
above normal, shows initiative, capable of
assuming greater responsibilities; Groups II
and III. Quantity and quality of production
are entirely satisfactory, shows some ability
in handling unusual problems, learns new
work with average instructions. (This section
was divided into two sub-sections.); Groups
IV and V. Quantity and quality of production
are barely satisfactory, requires more supervision
than the average employee. (This group was
subdivided into two.); and Group VI. Quantity
and quality of production unsatisfactory.
In rating the clerks each one was compared
with the others as well as rated on the characteristics
indicated by the headings just given.

The ratings of the key-punch operators were
made by the tabulating department manager
who supervised clerks engaged in all of the
operations usual to Remington-Rand punch
card tabulating equipment. These particular
clerks were employed without experience and
were then trained to operate the Remington-Rand
key-punch machine, which has a typewriter
keyboard for letters and a special bank
of keys for the numbers. Most of these
operators were under 20 years of age and
nearly all were under 25. Few had had office
experience or typing training.

¹ Hay, E. N. Predicting success in machine bookkeeping.
J. appl. Psychol., 1943, 27, 483-493.


The Criterion 


Eighty-two key-punch operators were rated 
who had been hired from 1941 to 1944, had 
remained on the job long enough to become 
proficient and to establish records which could 
be rated, and had taken the three tests referred
to above. The number of key-punch operators
rated in each of the six groups was as follows: 
Group I. 6; Group II. 16; Group III. 31; 
Group IV. 13; Group V. 9; and Group VI. 7. 

The supervisor who did the rating was aided 
by several assistants who were group leaders. 
The range of test scores was somewhat greater 
than in normal times because the labor market 
was tight in these years, and it was frequently 
necessary to lower the hiring requirements. 

The tests used in this study were: (a) Minnesota
Clerical (Number Checking); (b) Number
Series Completion;² and (c) Name Finding.³

The Minnesota Clerical Test needs no description
here. The Number Series Completion
test was the form used by Guilford in the
Nebraska Revision of Alpha and was drawn
from several Alpha series. The Name Finding
test is modeled on the number test of I.E.R.
Clerical. It consists of 25 names on the front
of a sheet; for example, Allen B. Smith. On
the back of the sheet are groups of four names
together: A. C. Smith, A. B. Smyth, A. B.
Smythe, and A. B. Smith. The subject
reads the name on the front of the sheet and
then turns the sheet over in order to check the
correct choice. This operation is very much
like that performed by a bookkeeper in turning

² Obtainable from The Psychological Corporation,
522 Fifth Ave., New York.

³ Obtainable from Aptitude Test Service, Swarthmore, Pa.



Table 1 


Intercorrelations 


Key Punch Study 


Bookkeeper Study 


N-282 N = 39 pea 
i Qo o U 
2 (3) (4) (2) (3) seers 
E No. Name Minn. No. M 
Variable Nos. Series Find. Nos. Series 
= E 8 
(1) Criterion 30 25 26 (1) 51 .56 p 
(2) Minn. Numbers 04 33 (2) 4 35 
(3) Number Series A3 (3) Á 


from check or invoice to the correct ledger card
for posting.

Test scores of the 82 key-punch operators
were correlated with the six-step ratings and
with each other with the results shown in
Table 1. The Doolittle method gave a multiple
R of .380 with these three tests. This
multiple coefficient was disappointingly low,
and when first obtained caused the project to
be laid aside as not affording satisfactory prediction.
However, when a detailed examination
of the raw scores was made, it was apparent
that good prediction was possible in spite of
the low R.

Finding Cutting Scores From Scattergram

To study the scores, scattergrams were made:
Table 2 shows Minnesota Numbers and Table
3 shows Name Finding. It was possible to select
critical scores for each test from these scattergrams.
For example, Minnesota Numbers
alone gives a good prediction of success. A
score of 130 or more was made by 33, or 40%
of the criterion group of 82 clerks, of whom
27, or 82% of the 33, were rated “good” and 6,
or 18%, were “poor.” It has been assumed
from the wording of the ratings that those in
Table 2 
Scatter Diagram Showing Relation Between Ratings of Key Punch Operators and Scores 
on the Minnesota Numbers Test 
ms Scores on Minnesota Numbers Test 55-160 ott! 
, 5- 100- 105- 110- 115- 5- m - 150- 155- ‘ 
Rating Groups — 94 99 104 109 tik 1j) a o E es n o Vet 80 PE “a 
=. 
L Good 1 4 1 1 |; a 
IL. Good 1 3 1 3 ) à 1 3! 
III. 
IL Good 1 | d $ $ s AA» 3 iA 
IV. Poor 1 a 3 3 1 1 1 9 
V. Poor 1 1 3 2 1 i 
VI. Poor 2 1 1 1 1 1 1 p — 
Number “good” 1 1 eee | 
1 1 9 4 232 ale DU 4 
Number “ ” P 5 8 8 1 0 
mber “poor’ 2 1 0 5 1 7 4 3 0 2 1 1 $ ; 
Cumulative “good” 53 x 
5 
Cumulative “poor” — 39 7 z fo 49 40 36 34/27 22 14 6 5 : 0 
Cumulative tota] & 7 mp 7 2 2 13 sjó s 5 3 2 à 3 
Ehi 6 70 60 49 43/33 27 19 9 7 
o Pass 100 96 94 9s - 6 4 
% “Good” of those 8 73 6 5249 33 23 mw ð 
who pass 65 100 
» 66 7 EET i 
66 6 70 6 74 wile 9» mw o n 9? 
Uncertain 19/20 Success 27/6 or 4-5/1 




Table 3 


Scatter Diagram Showing Relation Between Ratings of Key Punch Operators and Scores 
on the Name Finding Test 


Scores on Name Finding Test 


_ Rating Groups 6 7 S 9 10 ü 12 13 14 15 16 17 18 19 20 21 Tota 
I. Good 1 1 3 1 6 
II. Good 2 4 4 2 2 1 1 16 
III. Good 1 1 7 6 6 4 2 1 31 
IV. Poor 1 1 3 3 2 1 H 1 13 
V. Poor 1 1 1 2 1 9 
VI. Poor i $ 1 7 
Ruther “good” 0 0 o 1 4 9 1 10 7 5 4 1 210 
Number “poor” 1 D 2 4$ 4 2 3$ 7 1 0 2 1 0 1 
arua "mood" 53 53 583 53} 5 48 30 28 18 11 6 2 1 0 
quise “poor” 29 28 26 24 12 15 12 & & & 2 1 1 
"umulativetota] — 82 81 79 77| 73 |65 54 40 23 15 10 4 2 1 
% Pass 100 99 96 94| so |70 66 49 28 18 12 5 2 1 
^0 “Good” of those 
Who pass 65 65 67 69 71 74 72 70 78 73 60 50 50 
Failure 1/8 4/4 Success 48/17 or 2.8/1 
Doubt- 
ful 


the first three rating groups were the “good”
clerks and those in the last three groups were
the “poor” clerks. There are 53 of the former
and 29 of the latter. Of the clerks who scored
less than 130 on Minnesota Numbers, there
are about as many poorer ones as better ones.
Accordingly, the scores below 130 on this test
fall in “uncertain” territory. Above that point
the ratio of better to poorer is 4½ to 1.

In the case of the Name Finding test, as
shown in Table 3, the ratio of better to poorer
(score above 12) is not so high: 2.8 to 1.
However, except for “uncertainty” with a score
of 12, a prediction from scores of 11 and less
is strongly toward “failure,” or poorer clerks,
the ratio being 8 to 1 for failure. Another
difference between the tests is seen in the
critical percentages. The “passing” group
with Minnesota Numbers (making 130 or more)
comprises only 40% of the criterion group, of
which 82% are good and 18% poor, whereas
79% of the criterion group exceeded the critical
score on Name Finding and 74% of them were
the higher rated clerks against 26% lower
rated. The coefficients of correlation with the


rating criterion are .302 and .260, respectively, 
for Minnesota Numbers and Name Finding. 
These examples show how easy it is to find the 
best cutting scores for single tests even though 
the correlations are low. 
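
The arithmetic behind a single cutting score can be sketched as follows. The counts are those reported for Minnesota Numbers (33 of 82 clerks at 130 or more, 27 of them rated “good”); the individual scores of 135 and 120 are hypothetical placeholders standing in for “at or above” and “below” the cutting score, and the function name is illustrative.

```python
def cutting_score_summary(records, cutoff):
    """Return (number passing, per cent passing, per cent "good" of those passing).

    `records` is a list of (score, rated_good) pairs.
    """
    passed = [rated_good for score, rated_good in records if score >= cutoff]
    pct_pass = round(100 * len(passed) / len(records))
    pct_good = round(100 * sum(passed) / len(passed)) if passed else None
    return len(passed), pct_pass, pct_good


# Placeholder records built to match the reported counts:
# 27 good and 6 poor clerks at or above 130; 26 good and 23 poor below it.
records = (
    [(135, True)] * 27 + [(135, False)] * 6
    + [(120, True)] * 26 + [(120, False)] * 23
)
n_pass, pct_pass, pct_good = cutting_score_summary(records, cutoff=130)
```

Repeating the call over a range of cutoffs gives the kind of “% Pass” and “% Good of those who pass” rows shown at the foot of the scatter diagrams.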


Multiple Cutting Scores 


A study of the scattergrams suggested that a 
combination of critical scores from two or three 
tests might be even more effective in discrimi- 
nating between “good” and “poor” key-punch 
operators than the cutting score from just 
one test. The object was to find a combination
of scores that would be passed by a large percentage
of the clerks rated “good,” and that would
eliminate as many as possible of those rated “poor.”
In order to do this, it was decided to list the 
scores for individuals by rating groups. Table 
4 shows the scores of the applicants who were 
rated in the second and fifth of the six rating 
groups. The rest of the list is omitted to 
save space. 
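
A combination of cutting scores passes an applicant only if every test in the combination is at or above its own minimum. The sketch below illustrates the counting; the applicant records, the field names, and the function names are hypothetical and are not taken from Table 4.

```python
def passes(scores, cutoffs):
    """True if the applicant meets every cutting score in the combination.

    A test left out of `cutoffs` is simply not used for screening.
    """
    return all(scores[test] >= minimum for test, minimum in cutoffs.items())


def screen(applicants, cutoffs):
    """Return (number who pass, number rated good among those who pass)."""
    passed = [a for a in applicants if passes(a["scores"], cutoffs)]
    return len(passed), sum(a["good"] for a in passed)


# Hypothetical applicants with scores on the three tests.
applicants = [
    {"scores": {"minn_numbers": 136, "number_series": 13, "name_finding": 14}, "good": True},
    {"scores": {"minn_numbers": 97, "number_series": 11, "name_finding": 13}, "good": True},
    {"scores": {"minn_numbers": 146, "number_series": 7, "name_finding": 15}, "good": False},
    {"scores": {"minn_numbers": 118, "number_series": 8, "name_finding": 14}, "good": False},
    {"scores": {"minn_numbers": 112, "number_series": 10, "name_finding": 15}, "good": False},
]

strict = screen(applicants, {"minn_numbers": 110, "number_series": 9, "name_finding": 13})
lenient = screen(applicants, {"minn_numbers": 110})
```

Lowering or dropping a cutting score lets more applicants through but usually lowers the proportion rated “good” among them, which is the trade-off the trial combinations were meant to expose.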

Table 5 shows the results of trying different
combinations of cutting scores. The combinations
listed here are but a fraction of the many
trials that would theoretically be necessary to 
exhaust all the possibilities of combinations of 
cutting scores. But from a study of the list 
(given in part in Table 4) and the scattergrams, 
these seemed to be the only combinations that 
offered any promise. In Table 5 the critical 
score combinations are arranged in descending 
order, assuming that hiring standards would
be lowered successively as the labor market
tightened. Combinations of test scores in
this group give better results at all score levels
than the multiple regression equation,
because the distributions are not linear. The
best combinations for different levels of hiring
are shown in Table 6.

In studying the list of scores, it was noted
that there were several individuals in Group

Table 4 
Scores on Tests Taken by Key Punch Operators at Time of Application for Employment

Note: Scores are given for individuals in two of the six rating groups. The scores listed at the top of
the 5 columns at the right are some of the combinations of cutting scores on indicated tests which were selected
for trial. An X was placed after the scores of each individual who failed to “pass” the particular combination
of cutting scores. By counting these X’s in a given column, it is possible to show the number of good
and poor clerks who would have been eliminated if hiring had been based on these standards. These 5 combinations
are representative of the different combinations which were tried.


E 10 
(1) Minnesota Numbers 110 110 110 A 
(2) Number Series 9 8 7 J ee 
(3) Name Finding 13 12 11 13 
———— Ea 
(1) (2) '(3) 
Group II 
No. 1 136 13 14 
No. 2 150 15 18 
No. 3 116 17 15 x 
No. 4 97 11 13 x X X 
No. 5 141 9 16 
No. 6 110 10 13 
No. 7 141 12 15 
No. 8 126 11 18 
No. 9 186 11 19 
No. 10 126 11 16 
No. 11 139 14 15 
No. 12 125 14 14 
No. 13 142 14 14 
No. 14 114 11 14 
No. 15 110 12 15 
No. 16 142 13 20 
Group V x 
No.1 146 7 15 r - x x 
No. 2 118 8 M 5 * x x 
No. 3 118 8 14 x X 
E 112 10 15 : x 
No. 6 m s 10 ` X = ` x 
"e m y U X x x EX 3 
No. 8 117 6 X X x x 
12 21 £ 
No.3 127 7 : : . f x 15 
Number passed in Group IT X X og 16 7 
Number passed in Group V T E 15 2L] 
5 




Table 5

Possible Selections of Key Punch Operators from Test
Scores, Using Indicated Cutting Scores on the Given
Tests and Showing the Per Cent Who Pass and the
Per Cent Rated “Good” of Those Who Pass.

Note: N is 82, with 53 rated “good” and 29 rated “poor”.

                                            Per Cent Rated
Minn.    No.       Name       Per Cent      “Good” of Those
Nos.     Series    Finding    Who Pass      Who Pass
110      9         13         59            86
110      8         12         73            78
110      9         -          70            79
110      -         13         70            79
-        9         13         68            79
-        9         12         78            76
-        -         13         79            74
110      7         11         77            75
-        -         12         89            7
110      -         -          85            70
-        8         12         84            72
-        9         -          83            71


VI who made high scores on all three tests.
Some of these were found to be “problem
cases.” Of one it was said, “Hated the work;
was transferred to credit analysis.” Of another,
“Hired as an experiment; had only one
arm. Left because of a nervous condition.”
Of still another, “Had ability but refused to
cooperate. Poor health and attitude.” These
individuals were, of course, left in the sample.


These comments help to explain, however, why 
it is never possible to get 100 percent success in 
selecting key-punch operators, or any other 
workers, on the basis of test results alone. 


Comparing Results with Previous Study 


It is interesting to compare the results of the 
present study with the earlier study of 39 
bookkeepers in the same organization. Since 
the same clerical aptitude tests were used, the 
value of using multiple cutting scores can be 
studied in this sample also. In the bookkeeper 
group, slightly better prediction was obtained 
by means of the multiple regression formula 
than with multiple cutting scores. The R in 
this case was .70 as found by the Doolittle 
method, the tests which contributed being: 
(1) Minnesota Numbers, (2) Number Series 
Completion, and (3) Name Finding. The in- 
tercorrelations are given in Table 1. Here the 
criterion was actual rate of production, usually 
a more reliable criterion than ratings of super- 
visors. An examination of the results secured 
with multiple cutting scores in the bookkeeper 
group confirms the satisfactory prediction at- 
tained in the key punch group. Table 6 lists 
the successful predictions from the same critical
scores shown in Table 4 when applied to the 
39 bookkeepers. The dividing line between 
“good” and "poor" clerks was taken at the 


Table 6

Prediction Results for the Key Punch Group and the Bookkeeper Group Using Indicated
Cutting Scores on the Three Tests

Note: 82 Key Punch Operators, 53 rated “good” and 29 rated “poor”; 39 Bookkeepers, 29 “good” and 10
“poor” producers.

       Cutting Scores              Key Punch Group                Bookkeeper Group
                                            Per Cent Rated                 Per Cent Rated
Minn.    No.       Name       Per Cent      “Good” of Those   Per Cent     “Good” of Those
Nos.     Series    Finding    Who Pass      Who Pass          Who Pass     Who Pass
110      9         13         59            86                5s           86
110      -         13         70            79                79           81
110      9         -          70            79                61           83
-        9         13         68            79                56           86
-        -         13         79            74                84           79
-        9         -          83            72                64           84
110      -         -          81            n                 95           73




production rate of 105, which was the produc- 
tion average of the group at the time the study 
began in 1937; 29 operators achieved 105 or 
better and 10 operators achieved less than 105. 
In Tables 1 and 6 the results of both studies
are given for comparison and to show how one
study confirms the other.

In view of the way in which the results of 
the bookkeeper group confirm those of the 
key-punch operators it is interesting to note 
that the subjects were quite different in several 
respects. Although the key punch operators 
were mostly inexperienced at the time of hiring, 
the bookkeepers were almost all experienced 
clerks. Their average length of service was 9 
years and 2 months. The bookkeeper study 
took place in the years 1937-1940 and the key 
punch study after 1944. One group operated 
the Burroughs adding-bookkeeping machine, 
and the other was trained to use the Reming- 
ton-Rand key-punch machine. 

Comparison of the two prediction formulae 
shows the differences in the proportionate con- 
tributions of the three tests: 




Bookkeepers:

X₀ = .19 × Minn. Nos. + 1.34 × No. Series + 1.27 × Name Finding + 52

Key Punch operators:

X₀ = 45 × Minn. Nos. + .97 × No. Series + .35 × Name Finding + 0/9


Summary 
A study was made of prediction of success
in a group of 82 key punch operators on the
basis of a battery of three clerical tests. Although
the Doolittle multiple R was low, it
was possible to make satisfactory predictions
on the basis of a combination of cutting scores
on the three tests.

They require only fourteen minutes of
time and predict equally well for experienced
and inexperienced clerks.

Received April 12, 1950.
Early publication.


A Test Battery for Actuarial Clerks * 


Adam Poruben, Jr. 


Personnel Division, Metropolitan Life Insurance Company 


In June, 1947, the writer was requested to
validate a test battery for the selection of
Actuarial Clerks in the Metropolitan Life Insurance
Company. At this time, a review of
the job description of the Actuarial Clerk, in
cooperation with an Assistant Actuary, revealed
at least three characteristics necessary
for the performance of this job, namely, mental
alertness, numerical aptitude, and memory.
Because of practical considerations, only a
small sample of 12 Actuarial Clerks was available
at this time. Six tests, designed to measure
the above characteristics, were administered
to this group. The rank correlations
between test scores and overall performance
on the job indicated that five of these tests
might have some value for the prediction of
success on the Actuarial Clerk job. The rank
correlations for the individual tests were fairly
low, but when the raw scores were transformed
into standard scores and a composite score was
obtained for each individual, a rank correlation
of +.71 was obtained between the composite
score and overall performance on the job.
Because of the small sample, the writer and
the Assistant Actuary decided to use these tests
tentatively until it would be possible to validate the
battery on a larger sample. The purpose of
this paper is to report the results of such a
study.
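
The composite-score procedure described above can be sketched as follows. The test scores and performance ratings are hypothetical, the composite is assumed to be an unweighted sum of standard scores (the text does not give the weighting), and the rank-difference formula assumes there are no tied ranks.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standard scores for one test across all clerks."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


def ranks(values):
    """Rank 1 goes to the highest value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_correlation(x, y):
    """Spearman rank-difference coefficient: 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical raw scores of five clerks on two tests, and their job ratings.
test_1 = [10, 20, 30, 40, 50]
test_2 = [7, 3, 9, 4, 12]
performance = [3, 1, 4, 2, 5]

composite = [a + b for a, b in zip(z_scores(test_1), z_scores(test_2))]
rho = rank_correlation(composite, performance)
```

Converting to standard scores before summing keeps a test with a wide raw-score range from dominating the composite.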


The Sample 


The sample consisted of 125 Actuarial Clerks
who had taken the five tests used in the original
study. The majority of these Actuarial Clerks
were either Computing or Calculating
Clerks. All of these employees had at least
six months of service on the particular job
they were doing at the time of this study. The
average length of service on the job for the
entire group was 13.5 months with an S.D. of
«65 months. The average length of service

* Grateful acknowledgment is made to Mr. T. A.
Cro[illegible], Mr. V. G. Christman, Mr. V. A. Lane, and
[illegible] Rhoades for their assistance in this study.


with the company for the entire group was 9.5 
years with an S.D. of 8.10 years. 

The Home Office Clerical Jobs of the Metro- 
politan Life Insurance Company are at the 
present time classified into 19 levels, ranging 
from level A to level S. In order to give the 
reader an idea of the complexity and respon- 
sibility of the various jobs represented in the 
above sample, the job level distribution of 
these 125 jobs is given. It is as follows: 15 B's, 
14 C's, 25 D's, 18 E's, 17 F's, 16 G's, 9 H's, 
3 I's, and 8 J's. 


Description of the Tests 


1. Otis Self-Administering Test of Mental 
Ability. This test is well known and need not 
be described in detail. The 30-minute time 
limit was used. The author reports a test- 
retest reliability of .92. 

2. L.O.-M.A. 4-M Test. This test is published
by the Life Office Management Association
and consists of 39 problems: 11 percentages,
10 fractions, 7 decimals, and 11 numerical
reasoning problems. It has a time limit of
one hour. The corrected odd-even reliability
of this test for the 125-case sample used in this study
was +.91. A test-retest reliability coefficient
of +.86 was also obtained for this test
on a small sample of 24 company employees.
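A "corrected" odd-even reliability is usually obtained by correlating the odd-item and even-item half scores and then stepping that correlation up to full test length with the Spearman-Brown formula. The sketch below assumes that this standard correction is the one meant here; the paper does not report the uncorrected half-test correlation, so the half-test value is back-computed purely for illustration.

```python
def spearman_brown(r_half, factor=2.0):
    """Step a part-test reliability up to a test `factor` times as long."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def half_test_r(r_full):
    """Invert the two-halves Spearman-Brown formula."""
    return r_full / (2.0 - r_full)


# Reported corrected odd-even reliability of the L.O.-M.A. 4-M test.
r_corrected = 0.91

# Implied (not reported) correlation between the odd and even halves.
r_halves = half_test_r(r_corrected)
print(round(r_halves, 3))

# Stepping the implied half-test value back up recovers the reported figure.
print(round(spearman_brown(r_halves), 2))
```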

3. Ratio-Proportion Test. This test was con- 
structed by the writer and consists of 11 
problems involving the use of ratio and pro- 
portion. Three of the problems call for simple 
interpolation of tables since considerable inter- 
polation of tables is done on most of the 
Actuarial Clerk jobs represented in this study. 
It has a time limit of 27 minutes. Its corrected
odd-even reliability was found to be +.77 for 
the 125-case sample of this study. 

4. Logical Memory Test. This test consists 
of a short story written in a single paragraph. 
The various ideas in this story are separated by 
diagonal lines to facilitate memorization. The 
subject reads this story for two minutes and
then reproduces from memory as much of the
story as he can. The score is the number of 
ideas correctly reproduced. No estimate of 
reliability was possible for this test. 

5. Wesman Personnel Classification Test. 
Only Part II of this test was used. This con- 
sists of twenty numerical problems. It has a 
time limit of 10 minutes. Its corrected odd- 
even reliability for the 125-case sample studied 
was +.85. The author reports a reliability 
coefficient of .82 for 174 chain store clerks. 


The Criterion 


The criterion consisted of ratings obtained 
on an experimental rating form. This form 
consists of eight traits: Knowledge of Work, 
Quality of Work, Quantity of Work, Ability to
Learn, Cooperation, Interest in Work, Attendance,
and Punctuality. Ratings on the first
six traits and an over-all rating based on all
traits were used as criteria in this study. The
trait ratings are obtained on a five-degree 
graphic scale. The scales on Knowledge of 
Work, Quality of Work, and Quantity of 
Work have a range of 1 to 20; those for Ability 
to Learn, Cooperation, and Interest a range of 
1 to 10; those for Attendance and Punctuality 
a range of 0 to 5. The over-all rating is ob- 
tained by summing the point ratings on all of 
the eight traits. 

This experimental rating form was designed
by the Company in 1948 in order to improve
the employee-evaluation procedures. It was
tried out on 3,275 non-supervisory clerical employees
in September 1948 and on 8,876 non-supervisory
clerical employees in February
1949. The results were quite satisfactory.
The mean and S.D. of the September 1948 over-all
ratings were found to be 69.8 and 11.16, respectively;
the mean and S.D. of the over-all
ratings on the 8,876 sample were found to be
69.5 and 11.26, respectively. Both of these
experimental runs were first explained to the
Company managers by the Pe[text missing]
time before these experimental runs. Also, the
managers had no knowledge at the time of the
first run that a second experimental run would
be made. Therefore, it is safe to
say that the two ratings on the 3,275 employees
were made independently of each other.

As was mentioned before, two ratings were 
available for 3,275 employees. From this 
group, 2,210 were selected who were still on 
the same job at the time of the second rating 
as the first rating. These 2,210 employees, 
therefore, had at least six months experience 
on their job at the time of the first rating and 
at least one year at the time of the second 
rating. Most of the jobs held by these 2,210 
employees are such that the employee can be 
trained to do the work in a few months. 
Thus, at the end of six months the employee 
should have attained enough skill on his or 
her particular job that fairly accurate evaluation
of his or her work is possible. When the
two sets of over-all ratings for these 2,210
employees were correlated, a coefficient of
reliability of +.69 was found.

The means and S.D.'s of these two sets of
ratings for the 2,210 employees were not statistically
significantly different. The first
ratings had a mean of 69.69 and an S.D. of
10.97; the second ratings had a mean of 70.35
and an S.D. of 11.08.

For obtaining estimates of reliability of the
trait ratings, the ratings of the employees from
the Actuarial Division who were rated twice
were used. It was found that there were 171
such employees who were rated twice, who
were on the same job at the time of the second
rating as the first, and who had at least six
months experience on their job at the time of
the first ratings. The rate-rerate coefficients
of reliability were found to be as follows:
+.64 for quality of work, +.65 for quantity
of work, and +.71 for ability to learn.


Results 


In order to see if the tests differentiated between
the outstanding and the poor workers,
the test scores of the thirty-two (25.6%) employees
with the highest over-all ratings were
compared with the test scores of the thirty-[illegible]
employees with the lowest over-all
ratings. The results are shown in Table 1.
Distributions for all variables were drawn
and found to be approximately normal. The
scatter plots between the variables indicated
that the use of the Pearsonian product-moment
correlations was permissible. The means,
S.D.'s, test intercorrelations and validity coefficients
are shown in Table 2.


Table 1

Average Test Scores and Ratings for the Best and Poorest Employees

                              Upper Group       Lower Group
Variable                      Mean    S.D.      Mean    S.D.       t
Ratio-Proportion               7.7    2.74       5.9    2.40     4.32*
L.O.-M.A. 4-M                 23.6    9.38      18.1    7.71     2.47*
Personnel Classification      12.6    3.95      10.7    3.64     2.00**
Otis Mental Ability           54.4   11.57      49.2   11.09     1.77
Logical Memory                46.1    9.22      44.6   10.73      .60
Over-all Ratings              82.9    3.18      58.5    6.84    17.69*

*  Significant at the one per cent level.
** Significant at the five per cent level.


The correlations between the tests and ratings
on Cooperation, Interest in Work, and
Knowledge of Work were not significant at the
one per cent level and, therefore, are not shown
in Table 2.
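The comparison of the upper and lower rating groups in Table 1 rests on a test of the difference between two means. A minimal sketch of that computation from summary statistics is given below. It assumes 32 cases in each group (the size of the lower group is not legible in the source) and an unpooled standard error; the paper does not state which formula was used, so the printed t values need not be reproduced exactly.

```python
from math import sqrt


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t ratio for two independent groups, from means and S.D.'s."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


N_UPPER = 32  # stated in the text
N_LOWER = 32  # assumption: the lower-group size is illegible in the source

# Means and S.D.'s as printed in Table 1.
t_wesman = t_from_summary(12.6, 3.95, N_UPPER, 10.7, 3.64, N_LOWER)
t_memory = t_from_summary(46.1, 9.22, N_UPPER, 44.6, 10.73, N_LOWER)

print(round(t_wesman, 2))  # Personnel Classification
print(round(t_memory, 2))  # Logical Memory
```

Under these assumptions, the Personnel Classification and Logical Memory rows come out close to the tabled values of 2.00 and .60.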

Two more correlations were found which do
not appear in Table 2. One of these was between
length of service with company and
over-all ratings. It was found to be .12. The
other was between over-all ratings and the
job level. This was found to be .32.

The Wherry-Doolittle technique was used 
to combine the tests into batteries. All the 
test-criterion correlations were corrected for 
attenuation in the criterion before this tech- 
nique was applied. The only combination that 
gave a higher validity coefficient than the 


single tests was the combination of the Ratio- 
Proportion and Wesman Personnel Classifica- 
tion Tests which resulted in a shrunken mul- 
tiple R of .457 and Beta weights of .333 and 
.159, respectively, with the criterion Ability 
to Learn. Apparently there is considerable 
overlap among the tests. 
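The two steps named above, correction for attenuation in the criterion and combination of two tests into a battery, can be illustrated with the standard formulas. The sketch below is not the Wherry-Doolittle procedure itself and omits its shrinkage step. The input correlations are readings of Table 2 (validities of .38 and .34 against Ability to Learn, and a test intercorrelation of .74), and the criterion reliability of .71 is the rate-rerate figure for Ability to Learn; treating these as the values the author used is an assumption.

```python
from math import sqrt


def correct_for_criterion_attenuation(r_xy, r_yy):
    """Validity corrected for unreliability in the criterion only."""
    return r_xy / sqrt(r_yy)


def two_predictor_regression(r1y, r2y, r12):
    """Standardized weights and multiple R for two predictors."""
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r2y * r12) / denom
    beta2 = (r2y - r1y * r12) / denom
    multiple_r = sqrt(beta1 * r1y + beta2 * r2y)
    return beta1, beta2, multiple_r


CRITERION_RELIABILITY = 0.71  # rate-rerate reliability, Ability to Learn

# Validities against Ability to Learn, as read from Table 2 (assumed inputs).
r_ratio = correct_for_criterion_attenuation(0.38, CRITERION_RELIABILITY)
r_wesman = correct_for_criterion_attenuation(0.34, CRITERION_RELIABILITY)

# Intercorrelation of the two tests, as read from Table 2 (assumed input).
beta_ratio, beta_wesman, big_r = two_predictor_regression(r_ratio, r_wesman, 0.74)

print(round(beta_ratio, 3), round(beta_wesman, 3), round(big_r, 3))
```

With these inputs the weights fall near the reported .333 and .159, and the unshrunken multiple R falls slightly above the reported shrunken value of .457.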


Discussion 


It can be seen from Tables 1 and 2 and the 
results given in the last section that the Ratio- 
Proportion Test is the best predictor since it 
correlated significantly with ratings on Quality 
of Work, Ability to Learn, and with over-all 
ratings. The L.O.-M.A. 4-M came out about 
the second best in that it was found to correlate 
significantly with the ratings on Quality of 
Work and Ability to Learn. The Wesman 
Personnel Classification Test also correlated 
significantly with two of the criteria of job 
success—Quality of Work and Ability to 
Learn. The Otis was found to correlate significantly
with only one of the criteria, namely,
Ability to Learn. The Logical Memory Test
showed no significant correlations with any of
the seven criteria used in this study.


Table 2

Test Intercorrelations and Validity Coefficients for 125 Actuarial Clerks *

Mean   S.D.   Description of Variable           B    C    D    E    F    G    H    I
11.4   3.78   Personnel Classification     A   [?]  [?]  .82  .74  [?]  .23  .15  .34
52.0  11.57   Otis Test of Mental Ability  B        .56  .78  .77  .17  .20  .14  .32
45.6   9.99   Logical Memory               C             .36  .41  .10  .12  .08  .20
19.9   8.61   L.O.-M.A. 4-M (Arithmetic)   D                  .30  .21  .25  .20  .34
 [?]   2.70   Ratio-Proportion             E                       .26  .30  .21  .38
 [?]   9.89   Total Rating                 F                            .92  .84  .77
13.5   2.57   Quality of Work              G                                 [?]  .78
 [?]   2.34   Quantity of Work             H                                      .59
 [?]   1.21   Ability to Learn             I

* A correlation of .23 is significant at the 1% level.

In view of the above results, it can be safely 
concluded that the Ratio-Proportion, the 
L.O.-M.A. 4-M and the Wesman Personnel 
Classification Tests are valid for the prediction 
of success on the Actuarial Clerk job. Al- 
though the validity coefficients are not high, 
it can be stated that only a relatively small 
percentage of the employees tested, those with 
the higher test scores, are considered for selec- 
tion and placement after their other personnel 
records such as years of service, attendance, 
former ratings, etc. are reviewed. Under these 
conditions even relatively small validity coeffi- 
cients have some predictive value. 

It is not surprising that the tests did not 
show any validity for the prediction of behavior 
included under the trait Cooperation nor for 
Interest in Work. The tests were designed 
to measure aptitudes and not personality 
characteristics or interest. The results bear 
this out. 

The analysis in connection with the multiple 
correlation work indicated that there is considerable
overlap among the tests. It appears
that a two-test combination consisting of the 
Ratio-Proportion and the Wesman Personnel 
Classification Tests has as good or better 
validity than all five tests combined. The 
Logical Memory Test definitely does not add 
to the efficiency of prediction. This is prob- 
ably because of its poor reliability due to 
subjective scoring. Also it appears that what- 
ever is measured by the Otis is just as well 
measured by the L.O.-M.A. 4M, Ratio-Pro- 
portion and the Wesman Personnel Classifi- 
cation Tests. This is not too surprising since 
fairly large portions of these tests, especially 
the first two, consist of numerical reasoning 
problems. 

Summary 


A battery of five tests, designed to measure
mental alertness, numerical aptitude and
memory, [text missing] sample of 125 [text missing]
of 7.65 months. The average length of service
with the Company for the entire group was
9.5 years with an S.D. of 8.10 years. Their
job levels ranged from B to J.

The criterion consisted of ratings on six
traits and an over-all rating. The reliability
of the over-all ratings was estimated by obtaining
a correlation between two ratings on
2,210 employees, the two ratings being five
months apart. All of these employees were
on the same job during both ratings, had at
least six months experience on their job at the
time of the first rating and were rated by the
same supervisor at both times. This correlation
was found to be +.69. Another measure
of the stability of the over-all ratings was obtained
by finding the means and S.D.'s of the
two sets of ratings. The first ratings had a
mean of 69.7 and an S.D. of 10.97; the second
ratings had a mean of 70.4 and an S.D. of 11.08.
The reliability coefficients for the traits, on a
sample of 171 employees, ranged from +.64
to +.71.
Two of the tests were found to differentiate
significantly between the best 25 per cent
and the worst 25 per cent of the employees.
These were the Ratio-Proportion and the
L.O.-M.A. 4-M Tests. The Wesman Personnel
Classification Test differentiated significantly
between these two groups at the five per cent
level.

The Ratio-Proportion Test was found to be
the most valid, having significant correlations,
at the one per cent level, with three of the seven
criteria of job success, namely, Quality of
Work, Ability to Learn and over-all ratings.
The L.O.-M.A. 4-M was the second best predictor,
having significant correlations with
Quality of Work and Ability to Learn. The
Wesman Personnel Classification Test also correlated
significantly with two of the
criteria, namely, Quality of Work and Ability
to Learn.

Great overlap was found to exist among the
five tests. On the basis of the multiple correlation
results it appears that the combination
consisting of the Ratio-Proportion and
Wesman Personnel Classification Tests has as
good or better validity than all five tests
combined.


Received March 2, 1950.
Early publication.


Changes in Subjective Fatigue and Readiness for Work 
During the Eight-Hour Shift *


John W. Griffith, Willard A. Kerr, Thomas B. Mayo, Jr. 
Illinois Institute of Technology 


and 


John R. Topal


Belden Manufacturing Company


Although remarkable progress has been made
in industrial psychology in recent decades, it
is an interesting fact that remarkably little
research has been reported on the changes
which probably occur in the subjective fatigue
and readiness for work of personnel from one
part of a standard work shift to another. As
Ryan (5) has pointed out, most “fatigue” research
has reflected an academic preoccupation
either with trying to measure objective fatigue
or attempting to define fatigue with precision,
the latter task being one which Muscio (3)
decided as early as 1921 to be practically
hopeless. Because of the relative absence of
literature on the subjective feeling changes
with work, the problem is largely unmentioned
in existing textbooks on personnel and industrial
psychology.

Production curves for the work day in
various factory operations sometimes are presented
for their possible relevance (1, 4) to
fatigue, but it is admitted generally that many
factors other than fatigue, however defined,
influence the production curve. Feelings of
tiredness do not necessarily change in expected
directions with changes in rate of output.
However, to date, the scheduling of work and
locating of rest pauses in industry have been
done largely according to guesswork or with
reference to incidental practical considerations
independent of worker readiness.

If considerable consistency is found in worker
feelings of tiredness and readiness to work in
different types of work, it is possible that such
feeling curves will be useful in the more intelligent
scheduling of work and rest pauses
in business and industry.


* Senior author is W. A. Kerr.


The Present Research 


The present research entertains the hypothe- 
sis that employees in representative types of 
work possess definite attitudes as to when 
during the work shift they are most ready to 
work and when they are most tired. Em- 
ploying the "tear ballot" technique (2) a 
measuring device was constructed which ob- 
tains from the worker his estimate of when in 
each half of the eight-hour shift he feels most 
rested and most tired. All replies of 379 em- 
ployees were anonymous except that eleven 
ballots were temporarily coded for a crude 
test-retest reliability check in addition to other 
internal consistency evidence. Included in 
the sample were 232 male manual workers 
(handlers and sorters of light-to-100-pounds 
materials), 75 foremen in a rawhide factory,
and 72 office workers (48 male, 24 female). 
Foremen were measured at a regular meeting 
of the Chicago Rawhide Management Club 
and the other personnel were measured while 
at work in a distributing organization (manual 
workers) and in an electronics plant (office 
workers). The supervisory and office per- 
sonnel were regular day shift employees but 
the 232 manual workers began their various 
shifts in the three-hour period from 3:30 to 
6:30 p.m.


Results 


Obtained subjective reports of tiredness and 
readiness for work at various hours of work 
were analyzed with respect to age, sex, and 
type of work performed. Repeat-test relia- 
bility coefficients for small groups ranged from 
.69 to .92. 

Age. Office and manual workers were
studied separately as to possible age tendencies
in work feelings. Each group was divided
into younger (20-35) and older (36-65) personnel
and the per cent of each age group feeling
tired (and rested) at each of the eight hours of
the shift was calculated. Considering each
half of the work shift separately, normal
chance expectancy, assuming no change in
work feelings with successive hours of work,
would yield 25 per cent response at each hour
of each half of the work shift. Actually, conspicuous
changes in both tiredness and readiness
for work are reported by both older and
younger workers in successive hours of work.
Older workers, both office and manual, report
greater average feeling deviations from chance
expectancy than do younger workers. This
tendency, shown clearly in Table 1, seems to
indicate that older workers are introspectively
more conscious of feelings of tiredness and
readiness for work at specific hours of the work
shift than are younger workers. It is possible,
of course, that “objective” fatigue is equally
present in the younger workers but that the
younger workers are less affected in their subjective
feelings by their continually changing
organic conditions than are older workers.
Another explanatory hypothesis is that younger
workers simply have less insight into their
feeling changes with successive hours of work.
Whatever the most tenable explanation, older
workers in this study report significantly
greater extremes of work feelings than do their
younger associates.
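The age comparison rests on the mean deviation of the hourly percentages from the 25 per cent chance expectancy. A minimal sketch of that index follows. It uses only the first-half (hours 1 to 4) "most tired" percentages for manual workers as they can be read in Table 1; the mean deviations in the table itself appear to cover more than these four hours, so the figures below are illustrative rather than a reproduction of the tabled values.

```python
CHANCE = 25.0  # per cent expected at each of four hours if feelings did not change


def mean_deviation_from_chance(percents, chance=CHANCE):
    """Average absolute departure of hourly percentages from chance."""
    return sum(abs(p - chance) for p in percents) / len(percents)


# "Most tired" percentages for hours 1 to 4, manual workers, read from Table 1.
younger_manual_tired = [32, 5, 16, 47]   # ages 20-35
older_manual_tired = [20, 7, 18, 55]     # ages 36-65

md_younger = mean_deviation_from_chance(younger_manual_tired)
md_older = mean_deviation_from_chance(older_manual_tired)
print(md_younger, md_older)
```

A larger index means that reports are concentrated in particular hours rather than spread evenly over the half shift.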
Sex. Since all the manual workers are male,
the sex comparison is limited to office workers
(48 males and 24 females), groups too small
for any except suggestive comparisons. A
suggestive tendency is present for female employees
to report greater extremes of tiredness
than do males.
Type of Work. Curves of work feeling
throughout the work spell for manual work,
office work, and supervising are displayed in
Figures 1 and 2. It is significant that these
three curves in each graph are all highly
similar, despite the fact that they are derived
from reports of employees doing dissimilar
types of work and in different firms and industries.
Introspectively, apparently, workers
of widely differing types experience substantially
the same feelings of tiredness and readiness
for work at specific periods in the work
spell. The similarity of these curves is all the
more striking when it is considered that the
manual workers are “swing” shift rather than
regular day shift personnel. Extent of subjective
tiredness feelings appears from Figure
2 to be in part a function of degree of manual
effort involved in jobs performed. Supervisors
show minimal variation about the line (25 per
cent) of chance expectancy while manual
workers show maximal variation for the three
groups studied. Period of maximal tiredness
seems to be the hour preceding the lunch
period, while another peak of tiredness is
during the last hour of the work shift. Readiness
for work in terms of per cent of personnel
reporting themselves as “most rested” is
maximum in the second hour after the beginning
of each work spell and it is minimal
in the last hour of each work spell.


Table 1

Per Cent of Each Age Group Among Manual and Office Workers Reporting Feelings of
“Most Tired” and “Most Rested” at Each Hour of the Work Shift, and Mean Deviations
of Such Reports from Normal Chance Expectancy

              Per Cent “Most Tired”           Per Cent “Most Rested”
              Manual         Office           Manual         Office
Hour of       Age    Age     Age    Age       Age    Age     Age    Age     Chance
Work Shift    20-35  36-65   20-35  36-65     20-35  36-65   20-35  36-65   Expectancy
1              32     20      27     41        27     38      31      6      25
2               5      7       3     18        36     29      51     53      25
3              16     18      18      0        20     15      15     35      25
4              47     55      52     41        17     18       3      6      25
Total         100    100     100    100       100    100     100    100     100
5             [?]     16     [?]    [?]        28    [?]      25     24      25
6             [?]      2       9     12        35     41      42     52      25
7             [?]     37     [?]    [?]       [?]    [?]     [?]    [?]      25
8             [?]     45      29     41        13     12       9     12      25
Total         100    100     100    100       100    100     100    100     100
M.D.          [?]    [?]     [?]   14.5       6.5   10.3    12.3   16.3
N              84    148      55     17        84    148      55     17


[Figure: per cent (vertical axis) by hour of shift, 1 to 8 (horizontal axis), with separate curves for manual workers, office workers, and supervisors.]

Fig. 1. Per cent of manual, office, and supervisory employees reporting maximal feelings of restfulness
at each hour of each half of the eight-hour work shift.


[Figure: per cent (vertical axis) by hour of shift, 1 to 8 (horizontal axis), with separate curves for manual workers, office workers, and supervisors.]

Fig. 2. Per cent of manual, office, and supervisory employees reporting maximal feelings of tiredness
at each hour of each half of the eight-hour work shift.


Summary 


Manual, office, and supervisory employees 
totalling 379 from three different establish- 
ments were measured with a Kerr “tear ballot” 
for subjective feelings of tiredness and restful- 
ness in the various hours of the eight-hour 
work shift. 

1. Manual, office, and supervisory personnel 
report significantly differential feelings of tired- 
ness or restfulness for various periods in the 
work shift. 

2. Older workers report significantly greater 
variation of such feelings than do employees 
under age 36. 

3. Curves of tiredness feeling and restfulness
feeling throughout the work shift are remarkably
similar for the manual, office, and supervisory
employees in this study. The similarities
are more impressive than the dissimilarities.

4. Maximal subjective fatigue is reported
in the fourth and eighth hours of the eight-hour
shift.

5. Maximal restfulness feeling is reported
in the second and sixth hours of the shift, the
second hour of each four-hour work spell.

6. In possible future evaluation of the psychological
and efficiency advisability of the
six-hour day, it is recommended that such data
as these reported here be obtained from employees
now engaged in six-hour shifts. Such
new data should be examined particularly for
(a) less variability of tiredness feeling [illegible]
and (b) relative absence of high tiredness [illegible]
just before the middle and end of the work shift.


Received September 16, 1949. 


References


1. Goldmark, J., Hopkins, M. D., Florence, P. S., and Lee, F. S.
   Studies in industrial physiology: fatigue in relation
   to working capacity: comparison of an eight-hour
   plant and a ten-hour plant. U. S. Public
   Health Service, Public Health Bulletin No. 106,
   1920.
2. Kerr, W. A. Where they like to work; work place
   preference of 228 electrical workers in terms of
   music. J. appl. Psychol., 1943, 27, [illegible]-442.
3. Muscio, B. Is a fatigue test possible? British J. of
   Psychol., 1921, 12, 31-46.
4. Rothe, H. F. Output rates among butter wrappers:
   I. Work curves and their stability. J. appl.
   Psychol., 1946, 30, 199-211.
5. Ryan, T. A. Work and effort. New York: Ronald
   Press Co., 1947.


Accident Proneness of Factory Departments * 


Willard A. Kerr 
Illinois Institute of Technology 


The extent to which individual accident
proneness exists or has been a determinant of
physical casualties in industry plainly has been
exaggerated by many earlier authoritative
writers according to more recent evidence (3,
[illegible]).

Much factory data which appear at first
examination to indicate that certain employees
are persistent “repeaters” and therefore “accident
prone” fail to substantiate such conclusion
upon detailed probability study. Accidents
distributed by chance (under the theory that a
certain approximate number are inevitable
under the existing total work situation in a
factory department) will supply some workers
with no accidents, some with one, some with
two, and a few with even three or more (7).
Because such analysis actually does succeed
in most factory experience in explaining much
of the individual employee “repeat” accidents
data, the time-honored approach of the psychologist
and psychiatrist (4) which emphasizes
identification of subtle personality conditions
which predispose to accidents by some employees
seems to be a less promising approach
than that which emphasizes study of the total
psychological climate in which the typical employee
of a group works. If proneness (or liability)
to accidents exists such tendency may be
a group psychological phenomenon as well as
an individual psychological phenomenon.
The fact that intelligent safety engineers and
industrial training personnel working with individuals
and equipment often are unable to
take some factory departments out of the
“accident prone” column even after years of
intense effort is proof that many group (as well
as individual) psychological conditions may be
operating.
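The chance argument above can be made concrete with a Poisson model, a common way of describing accidents that fall on workers purely by chance. The sketch below is an illustration only: the department size and accident total are hypothetical, and the choice of the Poisson distribution is an assumption rather than the specific method of the studies cited.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of exactly k accidents when the mean per worker is `rate`."""
    return exp(-rate) * rate ** k / factorial(k)


def expected_accident_counts(n_workers, n_accidents, max_k=3):
    """Expected number of workers with 0, 1, ..., max_k-1, and max_k or more accidents."""
    rate = n_accidents / n_workers
    counts = [n_workers * poisson_pmf(k, rate) for k in range(max_k)]
    counts.append(n_workers - sum(counts))  # everyone with max_k or more
    return counts


# Hypothetical department: 200 workers sharing 100 accidents by chance alone.
expected = expected_accident_counts(n_workers=200, n_accidents=100)
for label, count in zip(["0", "1", "2", "3 or more"], expected):
    print(label, round(count, 1))
```

Even with no differences among workers, a few are expected to have two or more accidents, which is why apparent "repeaters" alone do not establish individual proneness.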
The Present Study 


Subjects for this study were 53 accident
prone and non-accident prone departments in
the Camden Works of RCA involving 12,060
employees. These data were collected in 1943.
Forty other variables were investigated in each
department.

* Acknowledgment of invaluable advice and assistance
in obtaining the accident data is made to O. C. Boileau,
Safety Department, Radio Corporation of America,
and to Dean F. H. Kirkpatrick, Bethany College, for
constructive criticism.

Accidents per hundred workers per year for 
these departments ranged from 0.0 to 22.7, 
although 38 of the departments had rates of 
less than four accidents per 100 workers. 
Severity ratings, based largely on days lost 
from work, also were obtained for each depart- 
ment with the advice of the plant safety 
director. These severity values ranged from 
0 to 75. 

It is only because of the grave importance of 
the objective that such unpromising potential 
correlates of accidents as some of these reported 
in this study were investigated. Of the forty 
variables studied only a few were significantly 
related to accidents, as expected, yet at least 
two of these results have not been reported 
previously in accident literature; therefore, 
they may justify the entire investigation. 

Because both accident variable distributions 
were positively skewed and several of the 
variables studied consisted of dichotomous or 
two-interval data, the tetrachoric coefficient 
of correlation (2) was employed. The statis- 
tically significant correlations (five per cent 
level) in Table 1 are indicated according to 
use of Kelley's reliability formula (6) and the 
Guilford-Lyons tables (5). 
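When both variables are cut into two intervals, a tetrachoric coefficient can be approximated from the fourfold table with the cosine-pi formula. The sketch below uses that approximation as a stand-in; it is not necessarily the computing procedure of the cited sources, and the cell counts are hypothetical (they merely sum to 53 departments).

```python
from math import cos, pi, sqrt


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for a 2 x 2 table.

    a and d are the concordant cells (high-high, low-low);
    b and c are the discordant cells.
    """
    if a * d == 0 and b * c == 0:
        raise ValueError("coefficient is undefined for this table")
    if b * c == 0:
        return 1.0
    if a * d == 0:
        return -1.0
    return cos(pi / (1.0 + sqrt((a * d) / (b * c))))


# Hypothetical split of 53 departments on noise level and accident frequency.
a = 20  # high noise, high accident frequency
b = 7   # high noise, low accident frequency
c = 8   # low noise, high accident frequency
d = 18  # low noise, low accident frequency

r_tet = tetrachoric_cos_pi(a, b, c, d)
print(round(r_tet, 2))
```

The approximation is most trustworthy when both variables are split near their medians.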

Inspection of these significant correlations 
reveals that accidents tend to occur with 
greatest frequency in those factory depart- 
ments with lowest intra-company transfer 
mobility rates, smallest per cent of employees 
who are female and on salary, least promotion 
probability for typical employee, and highest 
mean noise level. 

While departments highest in accident fre- 
quency usually also are above average in acci- 
dent severity, the severe accident departments 
have some systematic characteristics which are 
found less often in the high accident depart- 
ments. High severity departments are heavily 




Table 1 iiia 
P veri idents in 53 Factory Departments and Eac | 
Correlations between the Pr cen nin ae sony a Jersey Factory* Mna a: j 
Accident Aue 
Frequency — 
Variable 36 T 
1. Number of production employees 18 = j : 
2. Total employees 7 24 ES 
3. Per cent of employees who are male, production 20 " 
4. Per cent of employees who are male, salary 28 $ s | 
5. Per cent of employees who are production workers 46 2 : | 
6. Production employees per supervisor : ii 42 = " 
7. Mean hours worked per week per production male 30 -. : 
8. Mean hours worked per week per production female 97 a 
9. Mean base pay of production males ‘07 - j^ 
10. Mean base pay of production females 2 bin 
11. Sex hours differential, mean 3l - er q 
12. Sex wage differential, mean id a . 
13. Intra-company transfer mobility " * : 
14. Sex-ratio imbalance . —42 - 7 f 
15. Gross turnover rate (including accessions) — 06 A 
16. Avoidable turnover rate (including accessions) "12 E | 
17. Avoidable separation rate B T | 
18. Per cent of employees who are salaried male — 40 P^ 
19. Per cent of employees who are salaried female E — 08 r^ 
20. Per cent membership in company athletic association Bn ch 
21. Accident frequency " " ; 
22. Accident severity f Q6 : 
23. Efficiency (plant manager rating, three-month period) — 09 pe 
24. Efficiency (mean rating of ten competent judges) S ‘05 E 
25. Mean job security (mean rating of twelve competent judges) T n 
26. Mean supervisory quality (mean rating, twelve judges) L30 ^ 
27. Mean job prestige (mean rating, twelve judges) — 40 uw | 
28. Mean promotion probability (mean rating, twelve judges) 13 o5 
29. Mean job monotony (mean rating, twelve judges) . —07 "m 
30. Degree of completion of Work (rating, suggestions supervisor) ie E 
31. Fertility of suggestion field (rating, suggestions supervisor) 20 ‘a 
32. Suggestion quota (established by suggestions supervisor) " | 
33. Total suggestions submitted ps Ec 
34. Per cent of Suggestion quota met eut j J 
35. Per cent of Suggestions adopted 4 -3» 
36. Wage incentive System 00 AS 
37. Mean noise level p eii 
38. Labor-management mean morale rating (mean of 39 and 40) 200 -3 5 
39. Morale as rated by personnel manager B : C 
40. Morale as rated by union local officers (pres. and vice-pres.) = 23 e 
41. Youthfulness of employees (per cent under 26) = e | 
42. Tenure (per cent employed more than twelve months) 3 
* Coefficients in bold face are statistically significant at the five per cent level or better.


male in sex ratio for salary as well as production
[text missing], low in employee suggestions contributed, high
(relatively) in average employee age level, and
[text missing] average employee tenure.

Most of these correlations [text missing]
field. However, a few are worthy [text missing]
and interpretation. Possibly a substantial
intra-company transfer mobility [text missing]
more alert and interested in [text missing] work
environment, resulting in fewer accidents. The
cross-fertilization of ideas which probably accompanies
intra-plant mobility may act also
to reduce accident hazards and promote positive
cooperation with safety personnel.

The tendency for departments lowest in 
promotion probability to be high in both 
accident severity and frequency may be of 
considerable psychological significance. It is 
plausible that when promotion is too unlikely, 
the typical employee may develop accident 
prone attitudes of relative indifference to the work
environment. A reasonable chance to get ahead
may constitute an incentive which not only
stimulates the employee to do better work but
may make him more alert to avoid hazards
which may detain him in his progress.

Accident prone departments usually have
above average noise levels. Whether the
noise level is causal of accidents or merely an
incidental correlate of hazardous factory operations
is not entirely clear; it appears to be
both causal and incidental. Certainly the reduction
of excessive noise levels whenever
such reduction is practicable can be expected
to do more good than harm as regards accident
records.
The pattern of correlates of accident severity
is somewhat different from that of accident
frequency. As might be expected, maleness
is a marked characteristic of severe-accident
departments; probably females are rarely
placed on the most dangerous jobs or in the
most “strenuous” departments. Also, male
employees tend to be older; Chaney and Hanna
(1) found that the probability of fatal or disabling
results is greater among older than
among younger accident cases.

Less easily explained, however, is the fact
that severe accident departments are units
which tend to show a poor performance in
contributing to the plant suggestion system.
Superficially, it appears that departments
which lag in making constructive suggestions
through the employee suggestion boxes lag
also in correcting dangerous conditions and in
passing tips around on how not to get hurt;
the superficial interpretation easily may be
the valid one.

Another tenable hypothesis is that the average
“foresight factor” of intelligence is lower
in the severe accident departments because
foresightful employees tend to avoid or transfer
away from dangerous work departments. A 
third hypothesis, also possibly tenable, is that 
severe accident departments are those which 
have been so highly systematized and per- 
fected from the industrial engineering stand- 
point that the average worker feels no incentive 
to try to improve the work or workplace 
through either employee suggestion boxes or 
alertness to unexpected hazards. While this 
latter hypothesis is improbable, it does seem 
highly significant that departments which are 
high in suggestion fertility are low in accident 
severity. 


A New Frame of Reference 
for Safety Promotion 


Perhaps, as some of the correlations in this 
study seem to suggest, a fundamental change 
in the total psychological frame of reference 
in which the average employee works is the 
basic key to reduction of industrial accidents. 
Such a change probably can lower the probability of
accidents overall.

A psychological work environment that re- 
wards the worker emotionally for being alert, 
for seeking to contribute constructive sugges- 
tions, for passing a tip to a co-worker on how 
best to do something or how not to get hurt 
appears from this research to be a profitable 
goal to work toward. Creating or promoting 
such an environment undoubtedly calls for a 
much broader perspective of approach to acci- 
dents than has hitherto been considered by 
most managements. 

An additional item of circumstantial proof 
that that which promotes alertness also tends 
usually to minimize accidents is the fact in this 
study that the departments with incentive-pay 
systems have no more accidents than other de- 
partments. Approximately half of the depart- 
ments studied are on incentive systems. These 
same departments are "problem" departments 
in many respects (higher turnover, more 
monotonous work, less job prestige, and less 
promotion probability). In fact, they have 
almost all undesirable characteristics except 
accidents in greater quantity than do the non- 
incentive departments. The “normal expec- 
tancy” record as regards accidents in incentive 
departments practically is in defiance of 




physical and even some psychological work 
conditions. Even though incentive systems 
rarely succeed as much as they theoretically 
should in motivating the worker, they never- 
theless appear to make him more alert to attain 
a reasonable productive goal and this alertness 
apparently makes him safer in his operations. 
These observations on incentive systems are, 
of course, somewhat speculative. However, 
the need for providing emotional rewards for 
alertness seems highly probable from this 
research. Such rewards might include eco- 
nomic rewards, prestige-building honors, extra 
privileges, and representation on special com- 
mittees and councils. These rewards held as 
attainable goals by workers in “dead end” jobs 
should operate to raise the average level of 
alertness, not just to hazards but to everything. 


Summary 


Accident severity and frequency were cor- 
related with each of forty other variables in 
the 53 departments of an electronics factory. 

1. Accident frequency is associated with low 
intracompany transfer mobility, small per cent 
of employees who are female and on salary, 
low promotion probability, and high noise level. 

2. Accident severity is associated with pre- 
dominant maleness, low promotion probability, 
low fertility of suggestion field, low suggestions 
record, non-youthfulness of employees, and 
high average tenure of workers.


Willard A. Kerr 


3. A common explanatory factor among the
accident frequency correlates appears to be
depressants to alertness. The same factor appears
to be present in most of the severity
correlates.

4. Management should direct increased emphasis
toward enlivening of the psychological work
environment, particularly with reference to
provision of more and more emotional reward
goals as incentives to raise the average level
of alertness.


Received August 15, 1949.


References 


1. Chaney, L. W., and Hanna, H. S. The safety movement
in the iron and steel industry. Bur. Labor
Statistics Report 234, 1918.

2. Chesire, L., Saffir, M., and Thurstone, L. L. Computing
diagrams for the tetrachoric coefficient of
correlation. Chicago: Univ. of Chicago Bookstore,
1933.

3. Cobb, P. W. The limit of usefulness of accident
rate as a measure of accident proneness. J.
appl. Psychol., 1940, 24, 154-159.

4. Dunbar, Flanders. Medical aspects of ...

5. ... coefficient of correlation. ... 7, 243-249.

6. Kelley, T. L. Statistical method. New York: Macmillan,
1924.

7. Mintz, A., and Blum, M. L. A re-examination of
the accident proneness concept. J. appl. Psychol.,
1949, 33, 195-210.


The Rank-Comparison Rating Method 
Reign H. Bittner 


The Prudential Insurance Company of America 


and 


Edward A. Rundquist 
Personnel Research Section, AGO 


Validating tests or other measures of apti- 
tude for an industrial job always begins with a 
search for an adequate criterion. A variety of 
measures have been tried as the yardstick of 
job success: turnover, absenteeism, medical 
visits, production records, merit ratings and 
specially devised ratings. Turnover, absenteeism
and medical records may be adequate
criteria for validating personality tests or ap- 
plication blank material. These types of 
criteria will not be discussed here as our con- 
cern is with predicting ability to learn to do 
a job. 

Production measures would appear to be 
the ideal criterion for validating tests of apti- 
tude. Unfortunately, however, production is 
affected by so many things not under the con- 
trol of the individual that such records rarely 
reflect accurately what an individual can do. 
They are affected by variation in the quality 
of the material handled, by the pace of the 
machine, by the pace of others in the work 
group, by the correlation of job assignment 
with length of service, and even by the correlation
of job assignment with ability to do the
job. These and other factors operate almost
universally to make production records inadequate
as a criterion of ability to perform a job.
Even when such records are collected over a 
long period of time the factors mentioned do 
not necessarily average out. 

Merit ratings are often available as a possible
yardstick of job success. However, upon examination
they usually turn out to be of little
use. First, merit rating procedures usually
have several purposes only one of which is to
get a measure of the person's ability to do the
job. Other purposes such as aiding the supervisor
to deal more effectively with his people,
building morale, etc., affect the ratings in ways
making them undesirable as criteria. Second,




the distribution of merit ratings is often so 
skewed that their use as a discriminative 
criterion is ruled out. Third, most merit 
ratings are collected under such uncontrolled 
conditions and with so little training of the 
raters that they become valueless as criterion 
measures. It should be noted, however, that 
these defects are not necessary, as both Ferguson
(5) and Mahler (7) have shown.

Specially devised ratings are commonly used
as criteria since merit ratings and production 
measures cannot be relied upon to furnish a 
good measure of individual performance. 
These special ratings have been of many kinds. 
Probably most common are the variations of 
the rank order or paired-comparison methods. 
The writers have developed the rank-compari- 
son rating method, a new method of obtaining 
a rating criterion which has proved both 
practical and valuable in many industrial re- 
searches. It is the purpose of this paper to 
describe the rank-comparison rating method 
and to present data concerning certain char- 
acteristics of ratings obtained by this method. 


The Rank-Comparison Rating Method 


The rank-comparison rating method com- 
bines features of the ranking and paired- 
comparison methods. It involves the following
general steps: 

1. Separation of the total group into random 
sub-groups. 

2. Ranking within sub-groups. 

3. Successive merging of sub-groups by a 
modified paired-comparison method. 

The end result is a ranking of the total group 
from best to poorest achieved without the 
laborious comparisons involved when large 
groups are handled by the straightforward
paired-comparison method or the confusion 
that arises in trying to rank a large group. 




Preliminary Preparation 


The preliminary preparation required before 
contacting the rater is as follows: 

1. Prepare a small name card for each person 
to be rated. 

2. Arrange the cards in alphabetical order. 

3. Divide the rater’s total cards at random 
into two or more sub-groups with from 15 to 20 
cards in each sub-group, keeping the groups as 
nearly equal as possible. Decide how many 
sub-groups are needed. Then take the alpha- 
betically arranged pack of cards and deal out 
the required number of groups as in dealing 
hands in a card game, dealing one card at 
a time. 

It is convenient, though not necessary, to 
divide the total group into 2, 4, or 8 sub- 
groups. This makes it possible to work with 
the same number of cards in each group at 
each stage of the merging process. 
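
The dealing step can be sketched in a few lines of Python. This is only an illustration of the procedure described above: the function names, the `max_size` parameter, and the sample names are hypothetical and do not come from the original studies.

```python
def groups_needed(n_people, max_size=20):
    """Smallest number of sub-groups that keeps every hand at or below max_size.

    Hypothetical helper; the text only asks for 15 to 20 cards per sub-group.
    """
    return max(1, -(-n_people // max_size))  # ceiling division


def deal_into_subgroups(names, n_groups):
    """Alphabetize the cards, then deal them one at a time into n_groups hands."""
    ordered = sorted(names)
    groups = [[] for _ in range(n_groups)]
    for i, name in enumerate(ordered):
        groups[i % n_groups].append(name)  # next card goes to the next hand
    return groups


# Illustrative data: 75 made-up names
names = [f"Worker{i:02d}" for i in range(75)]
hands = deal_into_subgroups(names, groups_needed(len(names)))
```

Dealing from an alphabetical pack keeps the hands within one card of each other in size, which is the "as nearly equal as possible" requirement of step 3.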


Obtaining the Ratings 


The ratings are obtained during a conference 
with the rater. The procedure is as follows: 


Initial Judging 


1. Explain carefully and exactly to the rater 
what is to be considered in judging the persons 
to be rated. 

2. Lay out the cards for the first sub-group 
alphabetically in front of the rater in one or 
two columns depending on the size of the group. 

3. Ask the rater to choose the best person in 
the group, emphasizing again the basis on 
which the choice is to be made. 

4. Place the card of the person chosen as best
at the top of a new column of cards.

5. Ask the rater to choose the poorest person
in the group.

6. Place the card of the person chosen as
poorest at the bottom of the new column of
cards.

7. Ask the rater to choose the best person remaining
in the group. Place this person’s
card in the new column of cards under the one
previously placed at the top of the column.

8. Ask the rater to choose the poorest person
remaining in the group. Place this person's
card in the new column of cards above the card
already at the bottom of the column.




9. Continue this process, alternately choosing 
best and poorest persons among the cards re- 
maining in the original group and building 
the new column by placing each card selected 
as best under the last card placed at the top of 
the column, and each card selected as poorest 
above the last card placed at the bottom of the 
column. It should be obvious at this point 
that the cards are being placed in rank order 
by building from both ends toward the middle. 

10. When all cards have been transferred to 
the new column by this alternating selection 
procedure, ask the rater to check over the way 
the persons have been ranked making any ad- 
justments considered necessary. Permit per- 
sons to be moved up or down in rank if the 
rater so desires. Ordinarily, few adjustments 
will be made. 

11. Pick up the cards, keeping them in rank 
order from best of all on top to poorest of all 
on the bottom. Put this pack of cards aside 
for the present. 

12. Repeat steps 2-11 for each of the other 
sub-groups. 
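
Steps 2 through 9 amount to building a ranked column from both ends toward the middle. A minimal Python sketch follows, assuming the rater's judgments can be stood in for by two callables; `pick_best`, `pick_poorest`, and the `merit` numbers are hypothetical and serve only to simulate a consistent rater.

```python
def rank_subgroup(cards, pick_best, pick_poorest):
    """Alternately pull the best and the poorest remaining card.

    pick_best / pick_poorest stand in for the rater's choices: each takes the
    list of cards still on the table and returns one of them.
    """
    remaining = list(cards)
    top = []      # grows downward from the best
    bottom = []   # grows upward from the poorest (stored poorest first)
    choose_best = True
    while remaining:
        card = pick_best(remaining) if choose_best else pick_poorest(remaining)
        (top if choose_best else bottom).append(card)
        remaining.remove(card)
        choose_best = not choose_best
    return top + bottom[::-1]  # best of all first, poorest of all last


# Simulated rater with made-up merit values
merit = {"A": 3, "B": 9, "C": 1, "D": 7, "E": 5}
column = rank_subgroup(
    sorted(merit),
    pick_best=lambda cs: max(cs, key=merit.get),
    pick_poorest=lambda cs: min(cs, key=merit.get),
)
```

The adjustment allowed in step 10 is not modeled here; a real rater may reorder the finished column.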


Merging 


13. The next steps in the procedure involve
the merging of the sub-groups so that the end
result is the total group ranked in order from
best-of-all to poorest-of-all. The merging procedure
will be described for any two sub-groups.
If there are more than two sub-groups,
the steps in merging are shown in
Table 1 for any number of groups up to six.
For example, if there are three sub-groups,
merge sub-groups 1 and 2, and then use the
same procedure to merge sub-group 3 with the
total of sub-groups 1 and 2.

14. Place the cards for sub-groups 1 and 2
before the rater in two stacks, each stack
arranged in rank order from best person on top
to poorest person on the bottom. The rater is
directed to consider the two persons whose
names show on top of the two stacks of cards.
Ask the rater, “Which of these two is the
better?” The card chosen is removed and
placed upside down on the table to begin a new
stack of cards.

15. Ask the rater, “Which of the two now
showing is the better?” The card chosen is
removed and placed upside down on the new
stack.




Table 1
Showing Steps in Merging

Groups to Merge at Each Step with Varying Numbers of Sub-Groups *

Steps in
Merging   2 Sub-Groups   3 Sub-Groups     4 Sub-Groups         5 Sub-Groups           6 Sub-Groups

A         G1 with G2     G1 with G2       G1 with G2           G1 with G2             G1 with G2
B                        G(1+2) with G3   G3 with G4           G3 with G4             G3 with G4
C                                         G(1+2) with G(3+4)   G(3+4) with G5         G5 with G6
D                                                              G(3+4+5) with G(1+2)   G(1+2) with G(3+4)
E                                                                                     G(1+2+3+4) with G(5+6)

* Sub-groups identified as G1, G2, G3, etc.; G(1+2) is the combined group after sub-groups 1 and 2 have been merged, etc.


16. Continue the procedure, each time asking 
the rater to choose the better of the two names 
showing and placing the card chosen upside 
down on the new stack of cards. Usually, the 
question need not be asked more than twice; 
the rater quickly grasps the idea and proceeds 
through the comparisons without need of 
further prompting. When all cards have been 
transferred to the new stack, the two groups 
will have been merged into a single combined
group which is ranked in order from best to
poorest.

17. Continue merging groups according to 
the merging steps shown in Table 1. In 
merging each pair of groups, repeat the pro- 
cedure given in 14, 15, and 16 above. 
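
The merging in steps 14 through 17 behaves like the merge step of a merge sort, with the rater acting as the comparison function. The sketch below is an illustration under stated assumptions: `better` is a hypothetical stand-in for the rater's answer to "Which of these two is the better?", lists are kept best-first rather than as face-down stacks, and `merge_all` pairs adjacent groups round by round, which is a simplification that does not reproduce the exact order of Table 1 in every case (for example, with five sub-groups).

```python
def merge_ranked(stack_a, stack_b, better):
    """Merge two best-first stacks by comparing only the two top cards."""
    merged = []
    i = j = 0
    while i < len(stack_a) and j < len(stack_b):
        if better(stack_a[i], stack_b[j]) == stack_a[i]:
            merged.append(stack_a[i])
            i += 1
        else:
            merged.append(stack_b[j])
            j += 1
    # One stack is exhausted; the rest of the other is already in rank order.
    merged.extend(stack_a[i:])
    merged.extend(stack_b[j:])
    return merged


def merge_all(stacks, better):
    """Merge adjacent pairs of stacks, round by round, until one stack remains."""
    stacks = [list(s) for s in stacks]
    while len(stacks) > 1:
        next_round = []
        for k in range(0, len(stacks) - 1, 2):
            next_round.append(merge_ranked(stacks[k], stacks[k + 1], better))
        if len(stacks) % 2:
            next_round.append(stacks[-1])  # odd stack waits for the next round
        stacks = next_round
    return stacks[0] if stacks else []


# Simulated consistent rater with made-up merit values
merit = {"A": 3, "B": 9, "C": 1, "D": 7, "E": 5, "F": 6}
better = lambda x, y: x if merit[x] >= merit[y] else y
total_ranking = merge_all([["B", "E", "C"], ["D", "A"], ["F"]], better)
```

With a rater whose judgments are consistent, the order in which groups are merged does not change the final ranking, which is why the simplified pairing is adequate for illustration.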


Final Check 


18. At the conclusion of the merging process, 
the total group will have been placed in rank 
order from best to poorest. As a final check on 
the rank order, lay out the cards in rank order 
before the rater and ask him to make any ad- 
justments felt to be necessary. The rater is 
permitted to move persons up or down in rank 
if desired. Experience has shown that few 
adjustments are made, but it is desirable to 
give the rater this opportunity. 


Statistical Treatment of the Ratings 


The procedure results in rank-order ratings 
which are more amenable to statistical treat- 
ment if converted to another scale. In our 




studies, it has been found most useful to con- 
vert the rank-orders to standard scores. The 
method described by Garrett (6) has been 
used in making this conversion. 
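
As an illustration of converting ranks to standard scores, the sketch below uses a common normal-curve conversion: each rank is turned into a proportion of the group falling below it, and that proportion is mapped to a normal deviate. The mid-rank formula and the use of z-scores are assumptions made for this example; the table in Garrett (6) is not reproduced here and may differ in detail.

```python
from statistics import NormalDist


def ranks_to_standard_scores(n):
    """Return one z-score per rank 1..n, where rank 1 is the best person.

    Assumed conversion: proportion below = 1 - (rank - 0.5) / n, then the
    inverse of the standard normal cumulative distribution.
    """
    unit_normal = NormalDist()  # mean 0, standard deviation 1
    scores = []
    for rank in range(1, n + 1):
        proportion_below = 1 - (rank - 0.5) / n
        scores.append(unit_normal.inv_cdf(proportion_below))
    return scores


z = ranks_to_standard_scores(5)
```

Under this conversion the middle rank of an odd-sized group receives a score of zero, and the scores are symmetric about it.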


Characteristics of Rank-Comparison Ratings 


Reliability. The rank-comparison ratings 
are quite reliable in the sense that the same 
raters given the same directions will give 
essentially the same ratings even after a considerable
lapse of time. For example, a foreman
rated 75 factory women and rerated them
three months later. The correlation between
the two ratings was .92 (11). Another foreman
rated and rerated 31 factory women one
month apart. The correlation between the 
two ratings was .89 (2). 

The importance of measuring reliability of 
ratings under the same conditions is illustrated 
in the first study cited above. In an effort to 
get raters to control their bias favoring long- 
service employees, a straight paired-comparison 
method was tried along with special efforts to 
make the rater discriminate between ability 
and length of service of the people rated. The 
correlation between ratings by the rank-com- 
parison method and the paired-comparison 
method was .26. This is a marked change 
from the .92 when the same method with the
same directions was used. Other foremen
changed as noted in the study, but not to the 
same degree as this particular one. 

Agreement Among Raters. Agreement among 
raters is more likely to be a function of factors 
other than the method of rating. The degree 
of knowledge each rater has of the people 
rated and the varying standards used by the 
raters in judging performance are particularly 
important factors. It is of some interest, 
however, to note the degree of agreement 
achieved when the rank-comparison method is 
used. Forty-eight factory women were rated 
by their foreman, assistant foreman and by
nine inspectors (9). Each inspector rated from
4 to 40 of the women. The inter-correlations
between ratings were: foreman and assistant
foreman +.65; foreman and “average inspector”
+.67; and assistant foreman and
“average inspector” +.73.

In another study, 97 supervisors and foremen
were rated by three raters on personality
suitable for supervision (4). The intercorrelations
between raters were: plant manager and
personnel director +.63; plant manager and 
training director +.45; and personnel director 
and training director +.53. 

Agreement among the raters in these two 
studies is moderate to fairly high. Compar- 
able results have been found in other studies. 
This agreement can in no sense be interpreted 
as reliability of the ratings. However, if there 
were no agreement among raters who really 
knew their people and who were reasonably 
consistent in their rating standards, it would 
raise questions concerning the adequacy of 
the method. 

Relation to Paired-Comparison Ratings. A 
completely controlled experimental comparison 
of the rank-comparison method with the paired- 
comparison method is not available. Two 
studies are available from which inferences as 
to the relationship between the two methods 
can be drawn. The results of the two studies 
indicate that if all conditions are controlled 
the two methods give essentially the same 
results.

A study previously cited (11) involved rating 
of four groups of factory women by the rank- 
comparison method and rerating seven months 
later with a partial paired-comparison technique. 
In the rerating technique, the sub-groups set 
up in accordance with the rank-comparison 
method were ranked by a straight paired- 
comparison technique after special directions 
were given emphasizing that the rater should 
discount as much as appropriate the length 
of time persons had been on the job and rate 
solely on actual ability to perform the job. 
(These special directions were not given in the
original rank-comparison rating.) The sub-groups
were then merged according to the rank-comparison
technique. The rating and rerating
correlations for four raters are: Rater
1 (N=75) +.26; Rater 2 (N=82) +.90; Rater
3 (N=64) +.84; and Rater 4 (N=67) +.70.
The correlations between the two ratings are 
high in all but one case. The low correlation 
for Rater 1 appeared on investigation to be due 
to the change in directions on evaluating length 
of time on the job rather than to the paired- 
comparison technique used in ranking the sub- 
groups. It would seem then that to the extent 




the paired-comparison technique was involved 
it did not markedly change the ratings. 

A second study (1) involved rating of 18 
factory women. The rank-comparison method 
was modified slightly as follows: sub-groups 
were ranked by their foremen and the depart- 
ment supervisor then merged the sub-groups 
in the standard way. The department super- 
visor several days later then rated the 18 
women by a straight paired-comparison tech- 
nique. The correlation between the rank- 
comparison and the paired-comparison ratings 
was .97. Obviously, there is no essential 
difference in the ratings obtained by the two 
methods. 

Relation to Rating Scale Ratings. A study 
in the selection of supervisors (4) involved the 
use of several types of criteria of personality 
suitable for supervision: rank-comparison, a 
9-trait rating scale with each trait rated on a 
5-point scale, and a 2-point (above and below 
average) rating scale on overall personality for 
supervision. Two raters rated by rank-comparison
and the 2-point overall rating methods
all of the 96 supervisors they knew well enough
to rate. In addition, they rated the supervisors
immediately under them on the 9-trait
scale. The correlations between the rank-comparison
and the other two types of ratings
for each rater are: Rater 1, rank-comparison vs.
two-point overall scale (N=92), bi-serial
r=+.93; Rater 1, rank-comparison vs. nine trait
scale (N=19), +.88; Rater 2, rank-comparison
vs. two-point overall scale (N=96), bi-serial
r=+.75; and Rater 2, rank-comparison
vs. nine trait scale (N=13), +.77.

The rank-comparison method gives results
closely comparable to the other two methods
for Rater 1 and quite comparable for Rater 2.
Depending on the type of rating desired from
other considerations, it is indicated that the
rank-comparison method may be used with
confidence instead of rating scale methods.

Relation to Production Criteria. Rank-comparison
ability ratings and production criteria
should be closely related if both are valid and
reliable measures. The usual inadequacies of
production criteria mentioned earlier were all
present in such criteria available in our studies.
Thus, a close relation between the two types of
criteria was not expected. However, some
evidence has been found to indicate that as the



worker's individual control over production in- 
creases, the relationship increases between 
production measures and the ratings. In one 
study where the individual's production was 
largely machine-paced and where there was no 
opportunity for differences in type of material 
handled to cancel out, the correlation between 
the production criterion and the foreman's 
rating was .24 (10). In another situation the 
production of the individual was considerably 
more under the worker's control (9). In this 
situation where each worker could set her own 
pace, correlations between ratings and average
pay period efficiency were .84 for the inspectors'
rating, .73 for the assistant foreman's rating, 
and .78 for the foreman's rating. In a third 
study where again production was fairly well 
controlled by the worker, the correlations between
supervisors' ratings and three production
measures were as follows: .66 with production 
speed; .50 with production quality, and .70 
with overall production efficiency (1). These 
findings suggest that the rater makes allowances 
for difficulty of job assignment, quality of 
products handled and the like in making his 
ratings. They also suggest that with proper care,
ratings can be obtained which will be of almost 
as great value as adequately controlled produc- 
tion measures for the purpose of validating 
tests. This is an important consideration 
where adequate production records are not 
routinely kept since ratings can be obtained 
much more cheaply. 

Predictability of Rank-Comparison Ratings. 
It might be expected that ratings would be 
more predictable than production records if 
raters make allowances for the factors that 
often operate to make production records in- 
adequate criteria of the individual’s worth on 
the job. There is some evidence that this is 
the case. In the first study cited above (10) 
involving 63 factory women, a test battery 
was developed that correlated .26 with the 
production measure. However, it was possible 
to develop a test battery that correlated .47 
with the rating criterion. In the second study 
cited above (9), 37 women were tested. These 
were divided into two groups, one of 20 women 
with less than ten months service, and one of
17 women with ten months or more of service. 
The test battery was developed on the basis of 
an average production and rating criterion. 




Table 2

Correlations between Test Battery and Various Criteria

                                          Test-Criterion Correlations

                                          Short Service   Long Service   Total
                                          Group           Group          Group
Criterion                                 (N = 20)        (N = 17)       (N = 37)

Ave. Production and Rating                .61             [illegible]    .49
Ave. Foreman and Ass't Foreman Rating     .70             .45            [illegible]
Production Efficiency*                    .50             .38            [illegible]
Ave. Inspector's Rating                   .47             .42            .29

* The average of 10 to 26 pay-period efficiency indices based on standards set by Industrial Engineering Department.


The correlations of the test battery with the 
various criteria for the short and long service 
groups are shown in Table 2. 

The most predictable single criterion is the 
average rating of the assistant foreman and the 
foreman. Efficiencies, however, tend to be 
more predictable than the inspectors’ ratings. 
Perhaps the inspectors whose job is to detect 
violations of quality standards are overly in- 
fluenced by the knowledge of defects found and 
do not make sufficient allowances for the 
difficulty of the job or for varying lengths of 
service. They seem confused by the length 
of service factor. Instead of a correlation for 
the total group somewhere in between those for 
the short and long service groups as was found 
for the other criterion measures, the correlation 
for the inspectors is lower for the total group. 


Advantages of the Rank-Comparison Method 


Traditionally, ranking methods have been 
considered as a substitute for the paired-com- 
parison method. From the evidence presented 
above, the rank-comparison combination of the 
two methods appears to yield as satisfactory 
a result as the paired-comparison method. 
Moreover, it has four great operational ad- 
vantages: (1) it is easily understood by the 
raters; (2) raters like the method and have con- 
fidence in it; (3) it can be applied to large 
groups; and (4) it requires very little of the 
rater's time. 

The method is easily understood by the 
rater. At least 50 department heads, foremen 
and inspectors have rated their people with this 
method. Many of these raters had little verbal
facility but no difficulty in comprehension
has been encountered. In fact, the method of
merging the ranked sub-groups was invented
when one rater was unable to understand a
system whereby more than two names were
exposed simultaneously. Once the present
method had been devised, however, the rater
proceeded with no further difficulty.

Raters like and have confidence in the
method. Their reaction has always been
favorable. Not only is the method liked by
the raters but it invokes in them a feeling of
confidence that the ratings are accurate measures
of ability. Many have commented
that this method should be used in giving merit
ratings, although the writers do not agree
with this.

The method can be used with large groups.
When groups of 100 or more are to be rated, the
paired-comparison method obviously becomes
too unwieldy. With 100 cases it would involve
4,950 comparisons. It is also difficult to
attempt to rank this many people. To overcome
this difficulty, large groups are often
divided into random sub-groups of equal size
and then either the rank order or paired-comparison
method is used within sub-groups.
This involves the assumption that sub-groups
are truly random groups. The rank-comparison
method gets away from such problems
and such assumptions. It does not matter
how large the group is; it can be ranked from
1 to N by this method with very little complication.
The feature that makes this possible
is that the rater never has to consider more
than 20 persons at one time. In practice it
was discovered that when more than 20 cards
were placed before the raters, they sometimes
became confused and were hesitant in making
the original rankings. With 20 or fewer cases,
no difficulty has been encountered.
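
The figure of 4,950 follows from the number of distinct pairs among N persons, N(N - 1)/2. The Python sketch below checks that arithmetic and, as a rough comparison, counts the largest number of two-card judgments the merging stage could require. The function names are hypothetical, the merge order is a simple round-by-round pairing rather than the exact order of Table 1, and the best-and-poorest choices made within each sub-group are not counted.

```python
def paired_comparisons(n):
    """Number of distinct pairs among n persons: n(n - 1) / 2."""
    return n * (n - 1) // 2


def max_merge_comparisons(group_sizes):
    """Upper bound on two-card judgments needed to merge ranked sub-groups.

    Merging stacks of a and b cards takes at most a + b - 1 judgments,
    because the last card never needs to be compared.
    """
    sizes = list(group_sizes)
    total = 0
    while len(sizes) > 1:
        next_round = []
        for k in range(0, len(sizes) - 1, 2):
            total += sizes[k] + sizes[k + 1] - 1
            next_round.append(sizes[k] + sizes[k + 1])
        if len(sizes) % 2:
            next_round.append(sizes[-1])  # odd group carried to next round
        sizes = next_round
    return total


full_pairs = paired_comparisons(100)             # 4950
merge_bound = max_merge_comparisons([20] * 5)    # 256
```

For 100 persons in five sub-groups of 20, the merging stage needs at most 256 judgments under these assumptions, against 4,950 for the straight paired-comparison method.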

The method requires very little of the rater's 
time. While no systematic records have been 
kept, instances are known where close to a 
100 people have been ranked in less than 15 
minutes. 


Conclusion 


Ratings, with all their difficulties, are the 
most common criterion data in industrial 
personnel research. Not only are they the 
most common but they are probably the best 
under most circumstances. Production records,
if obtained under conditions of equal
training and control of all factors affecting 
individual performance except individual dif- 
ferences in ability, would be the ideal criterion. 
With the rare exception of some training pro- 
grams (8), such conditions are seldom met. 

The rank-comparison method of rating 
presented here is a practical and useful method 
which has certain advantages over other 
methods. It does not, however, solve all the 
problems incident to the use of ratings as 
validation criteria. Since in the majority of 
industrial situations the experimenter is re- 
duced to the use of ratings, it would seem useful 


to devote more study to methods of obtaining 
these ratings. The solution of these problems 
awaits further research. 


Received August 12, 1949.


References 


1. Bittner, R. H. The selection of bottle decorating 
machine operators. (Unpublished study.) 

2. Bittner, R. H. The selection of bottle inspector- 
packers. (Unpublished study.) 

3. Bittner, R. H. The selection of handyman-inspectors.
(Unpublished study.)

4. Bittner, R. H., and Rundquist, E. A. Development
of a supervisor personality test. (Unpublished
study.)

5. Ferguson, L. W. The effect upon appraisal scores
of individual differences in the ability of superiors
to appraise subordinates. Personnel Psychol.,
1949, 2, 377-382.

6. Garrett, H. E. Statistics in psychology and education.
New York: Longmans, Green and Co.,
1937. Pp. 168.

7. Mahler, W. R. An experimental study of two
methods of rating employees. Personnel, 1948,
25, 211-220.

8. McGehee, W. Cutting training waste. Personnel 
Psychol., 1948, 1, 331-340. 

9. Rundquist, E. A. Predicting success of glass 
selectors. (Unpublished study.) 

10. Rundquist, E. A. The selection of tumbler inspec- 
tor-packers. (Unpublished study.) 

11. Rundquist, E. A., and Bittner, R. H. Using 
ratings to validate personnel instruments: a 
study in method. Personnel Psychol., 1948, 1, 
163-183. 




Validity of an Objectivity Key on a Short Industrial 
Personality Questionnaire * 


Edward R. Carr and Harold F. Rothe 


Stevenson, Jordan and Harrison, Inc., Chicago, Illinois 


In a previous paper, Rothe described the use
of an Objectivity key on a short industrial 
personality questionnaire (3). The Objec- 
tivity key described there and referred to in 
this paper consists of six items patterned after 
the L scale of the MMPI (1). The Objectivity 
key was shown to permit the adjusting of 
scores on some other keys so that highly ob- 
jective persons would be compared with norms 
based upon other highly objective persons; 
persons of medium or low objectivity would 
also be compared with their appropriate groups. 
This technique is intended to minimize the 
effects of “faking” on questionnaires. 

This technique is only valuable, however, 
when the Objectivity key is valid, and when 
the other keys are valid. A valid Objectivity 
key is one that indicates which persons are 
being objective and which ones are not being 
objective while answering the questionnaire. 
A non-objective person is one who is “putting 
his best foot forward" and attempting to “look 
good" on the questionnaire. A highly objec- 
tive person is one who is extremely frank while 
answering the questionnaire, not attempting 
to hide what are apparent faults or weaknesses. 

The purpose of the present paper is to 
present some data that indicate that the 
Objectivity key used in this study does 
separate highly objective from non-objective 
respondents. 


Experimental Technique 


The industrial personality questionnaire (3) 
was administered three times to a group of 
fifty college students. Different instructions 
were given orally to the group each time. The 
first instructions were to “fake the question- 
naire" so as to “look good" for a job for which 
they were to assume they were applying. 
The second instructions, given after all students
had finished the form (in about five
minutes), were to “fake the form to look bad.”
The third instructions, given after all students
had finished the form the second time, were
to “be as honest as possible.”

* The authors wish to acknowledge the assistance of
Miss Judy Yackle in preparing this paper.

As far as could be observed, good rapport 
existed between administrator and students, 
and it was believed that the students were 
cooperating. The students were freshmen, 
chiefly in an engineering curriculum.¹ The
technique of several administrations with dif- 
ferent instructions is generally similar to the 
technique used by Giese and Christy, reported 
in Tiffin (4) and by Longstaff (2). The results 
(means and s.d.'s) for the four personality keys 
for the three kinds of test administration are 
shown in Table 1. The actual distributions of
these scores are filed in ADI.²


Results on The Objectivity Key 


As Table 1 shows, the responses of the students
on the Objectivity scale varied from
trial to trial and in the anticipated direction.
That is, when told to “fake to look good” the
students obtained very low Objectivity scores.
When they faked to make themselves “look
bad” they obtained very high Objectivity
scores. When they were “honest” they obtained
the pattern of Objectivity scores, resembling
a normal curve, that is customarily
obtained in samples of industrial personnel.

¹ The writers wish to thank Dr. Ernest McCormick,
Department of Psychology, Purdue University, for
permitting this experiment to be conducted in
his classes.

² Tables 2, 3, 4, and 5 have been deposited with the
American Documentation Institute and may be obtained
by ordering Document No. 2723 from American
Documentation Institute, 1719 N Street, N. W.,
Washington 6, D. C., remitting $.50 for microfilm (images 1
inch high on standard 35 mm. motion picture film) or
$.50 for photocopies (6 X 8 inches) readable without
optical aid.


Table 1
Means and S.D.'s of Four Personality Keys for Three
Kinds of Test Administration
Note: N = 50 engineering college freshmen.

Key                    Mean    S.D.

Objectivity:
    Look good           1.1     1.5
    Look bad            5.3      .9
    Be honest           3.0     1.6
Emotional Score:
    Look good           3.6     2.1
    Look bad           10.6     1.5
    Be honest           5.2     2.4
Social Dominance:
    Look good           7.9      .9
    Look bad            1.2     1.4
    Be honest           6.1     2.2
Drive Scores:
    Look good           6.1
    Look bad            4.7
    Be honest           6.3


The critical ratio between “good” and “bad”
was 17.27; between “good” and “honest” was
6.05; and between “bad” and “honest” was
8.68. All of these differences are significant.
It is apparent, then, that college students can 
vary their Objectivity scores, depending upon 
the instructions given them, and presumably 
depending upon their set. Since the changes 
in score are in the direction that would be ex- 
pected, according to the theory underlying the 
Objectivity key, or the MMPI L-Scale, it may 
be concluded that the Objectivity key does 
provide a measure of the extent to which the 
respondents are “faking.” A scale with a 
greater range would, of course, provide a more 
adequate measure. 
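
A critical ratio of this kind can be approximated from the Objectivity means and S.D.'s in Table 1. The sketch below assumes the usual formula for the difference between two uncorrelated means; the paper does not state which formula the authors used, and because the same fifty students took all three administrations a correlated-means formula may have been applied. The function name `critical_ratio` is illustrative.

```python
from math import sqrt

N = 50  # students per administration (Table 1)

# Objectivity key: (mean, S.D.) under each instruction, from Table 1.
objectivity = {
    "good": (1.1, 1.5),
    "bad": (5.3, 0.9),
    "honest": (3.0, 1.6),
}


def critical_ratio(a, b, n=N):
    """Difference between two means divided by its standard error.

    Assumes the two sets of scores are uncorrelated, which is an
    assumption of this sketch rather than a detail given in the paper.
    """
    (mean_a, sd_a), (mean_b, sd_b) = a, b
    standard_error = sqrt(sd_a ** 2 / n + sd_b ** 2 / n)
    return abs(mean_a - mean_b) / standard_error


for first, second in [("good", "bad"), ("good", "honest"), ("bad", "honest")]:
    cr = critical_ratio(objectivity[first], objectivity[second])
    print(f"{first} vs. {second}: {cr:.2f}")
```

Under these assumptions the three ratios come out near 17.0, 6.1, and 8.9, close to but not identical with the reported 17.27, 6.05, and 8.68. The remaining gap is presumably due to rounding in the table or to a different formula.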


Results on the Emotional Key 


In the previous paper it was shown that re- 
spondents with low Objectivity scores tended to 
have low scores on the so-called Emotional key. 
Respondents with high Objectivity scores had 
high Emotional scores. Accordingly the Ob- 
jectivity score was found to be useful in interpreting
the Emotional score (i.e., by making
adjustments upwards or downwards in the 
Emotional score). 

The students’ Emotional scores showed the
same relationships (see Table 1). The students,
when “looking good,” were being non-objective
in order to “look good,” and showed
low Emotional scores. The mean score was
3.6. When the students were “looking bad,”
the mean Emotional score was 10.6. When 
they were "honest," the mean was 5.2. The 
differences between these three conditions are 
again all significant, the critical ratios being 
19.55 between “good” and “bad,” 3.51 be- 
tween "good" and “honest,” and 13.75 be- 
tween "bad" and "honest." It is, therefore, 
apparent that college students can vary their 
Emotionality scores, depending upon their in- 
structions and presumably upon their sets. 
It may also be concluded that low Emotional 
scores may be associated with low Objectivity 
scores, and high Emotional scores may be 
associated with high Objectivity scores. 

There is a possibility that there is an in- 
evitable relationship between Objectivity and 
Emotional scores and that both are measuring 
the same thing. Data to be published shortly 
rule out this possibility. Leaving that ques- 
tion for the moment, it has been shown here 
that the Objectivity key measures what it 
purports to measure and that the technique of 
interpreting Emotional scores within a frame 
of reference established by the Objectivity 
scores is a valid one. 


Results on the Social Dominance Key 


Social Dominance scores were not found 
to vary with Objectivity scores for the in- 
dustrial sample previously reported. Inter- 
estingly enough, for the college sample dis- 
cussed in this paper, Social Dominance scores 
were found to vary with the administrative 
instructions, and also with the Objectivity 
scores. 

The mean Social Dominance score, under the 
“look good” conditions, was 7.9; the mean 
when “looking bad” was 1.2; the mean when 
"being honest" was 6.1 (see Table 1). The 
critical ratios are: 27.86 between “good” and 
"bad"; 5.15 between “good” and “honest”; 
and 13.15 between “bad” and “honest”; and 
all of the differences are significant. 

These results are particularly interesting in 
view of the fact that the students, when 
“honest” give essentially the same distribution 
of Social Dominance scores as does the in- 
dustrial sample previously reported. That is, 
college students apparently conceive of “Social 
Dominance" as being a desirable set of habits 


180 ward R. Carr and Harold F. Rothe 


for jobs to which they might aspire (“look 
good”), but industrial personnel of various 
categories do not "fake" questionnaires to 
make themselves appear more highly socially 
dominant than they are. That is a highly 
interesting point deserving more research.* 


Results on the Drive Key 


The industrial sample previously reported 
did not show a variation in Drive scores with 
Objectivity scores. The college students re- 
ported here do show some relationship be- 
tween these two keys. 

The mean score on Drive when “faking 
good” was 6.1; the mean when “faking bad” 
was 4.7; and the mean when “honest” was 6.3 
(see Table 1). The critical ratios are: 5.68 
between "good" and “bad”; 0.42 between 
“good” and “honest”; and 4.73 between “bad” 
and “honest.” The difference between “good” 
and “honest” is not significant but the other 
two differences are significant. Thus, the 
college students could “fake” the Drive key 
to show that they had little drive, but they 
could not, or did not, “fake” to show more 
drive than they actually possess. 

There are two additional features about the 
college students’ not “faking” a higher drive 
than they did in order to look good. One is 
that their Drive scores, while “honest,” were 
higher than while faking to look good, although 
insignificantly so. This leads to the possibility 
that this key is too subtle to fake. Unfor- 
tunately, the writers have no data to answer 
that problem. 

The second feature is the possibility that 
the students actually had such a high drive 
they did not believe they would have to fake it.
However, the students did not have an unu- 
sually high distribution of Drive scores, being 
substantially the same as those of the industrial 
samples previously described. 

Accordingly it is tentatively concluded that 
the Drive key is too subtle to be faked, with the
exception of one or maybe two items that can
be faked to show lack of drive.


* Neither of the groups of persons knew the names of
the keys, but the Social Dominance items are fairly
obvious; i.e., “I dislike walking across the middle of a
[...]”

Summary 


In a previous paper the use of an Objectivity 
key to establish a frame of reference for inter- 
preting certain other keys on a short industrial 
personality questionnaire was described. The 
present paper describes an experiment to estab- 
lish the validity of that Objectivity key. 
This experiment was conducted with college 
freshmen rather than with an industrial popu- 
lation. The experiment should be repeated 
with an industrial sample.

When instructed to “fake” the questionnaire 
to “look good" for a job for which they were 
to assume they were applying, the students 
obtained low Objectivity scores and low Emo- 
tional scores. The very low Objectivity 
scores would indicate to an interviewer that 
these persons might be “faking,” and their 
Emotional scores should be compared with 
other persons of low Objectivity scores. 

In a like manner, when the students faked to
“look bad” for the jobs, they obtained high Objectivity
and high Emotional scores, and these
could again be interrelated in interpretation.

The particular questionnaire described here
is used by consulting psychologists as an interview
aid. It is concluded from this study that
the Objectivity key is a valid key for locating
“fakers” and for locating extremely frank respondents,
and hence contributes to the interview.
It permits the use of different norms on
the Emotional key, based on the Objectivity
score.

Other findings are that college students fake
a questionnaire to make themselves appear
socially dominant, as a desirable job characteristic,
and socially submissive, as an undesirable
job characteristic. This relationship was not
found on an industrial sample previously reported.
College students fake a lack of drive
as an undesirable characteristic, but they
cannot, or do not, fake a high drive as desirable.


Received August 18, 1949.


* Readers may wonder why faking “good” and
“poor” do not yield a consistent pattern on the Objectivity
key. The nature of the Objectivity key, or
MMPI L-Scale, is such that opposite directions of
faking result in shifts to opposite ends on the Objectivity
or L-Scales.




References 


1. Hathaway, S. R., and McKinley, J. C. Manual
for the Minnesota Multiphasic Personality Inventory.
New York: The Psychological Corporation,
1943.

2. Longstaff, H. P. Fakability of The Strong Interest
Blank and The Kuder Preference Record. J.
appl. Psychol., 1948, 32, 360-369.

3. Rothe, H. F. Use of an Objectivity Key on a Short 
Industrial Personality Questionnaire. J. appl. 
Psychol., 1950, 34, 98-101. 

4. Tiffin, J. Industrial psychology. New York: Pren- 
tice-Hall, Inc., 1947, rev. ed., pp. 170-171. 


Getting Your Message Across by Plain Talk 


Arthur O. England 
Personnel Planning Office, Air Materiel Command, Dayton, Ohio 


In 1948, the Personnel Planning Office con- 
ducted an employee attitude survey in the Air 
Materiel Command (AMC). A sampling of 
our some 80,000 civilian employees revealed 
some highly interesting facts about the effec- 
tiveness of our communications. Slightly 
more than one-third of those sampled stated 
they did not know the procedure for submitting 
a grievance! The significance of those figures 
became apparent to our top management when 
they discovered that there were at least three 
different publications dealing with the subject 
of grievances. There was an Air Force Regu- 
lation, an AMC Regulation and a Civilian 
Personnel Letter distributed to each employee. 
Since approximately 70% of the civilians em- 
ployed by the Air Force work in the Air 
Materiel Command, this lack of knowledge of
personnel procedure seemed worthy of our
attention. We began asking ourselves just
how many other procedures and policies of
management were unheard of or misunderstood
by the employees. Inasmuch as the employees
were “informed” by our various publications, 
the difficulty might very likely lie in the 
language and style used in writing these 
directives. 


Readability of Our Publications 


Using the Flesch formula, an analysis was 
made of the readability of literally hundreds of 
Air Force Directives, Civil Service Regulations, 
Technical Orders, Maintenance Handbooks, 
employee newspapers, and so forth. Briefly,
our study showed that the Air Materiel Command
was not getting its message across to
the employees:


1. More than 90% of our people found it
hard to read and understand our directives.

2. Technical Orders and Maintenance Handbooks
used too many big words. Uncommon
non-technical words hinder the reader in
grasping technical ideas.

3. More than 90% of our people found it
hard to read and understand articles in our
civilian newspapers.

4. Office memoranda were written in the 
third person. They were filled with trite 
phrases. It took too long to read them. 
Also, messages in that style are hard to 
remember.

5. More than 60% of our airmen found it 
hard to read and understand directives addressed
to them.

6. More than 20% of our Air Force officers 
found it hard to read and understand messages 
addressed to them. 
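
For readers unfamiliar with the Flesch formula referred to above, the following is a minimal sketch of the Reading Ease computation and of Flesch's verbal labels for score ranges. The function names and the sample counts are illustrative assumptions. The sketch takes word, sentence, and syllable counts as given and does not attempt to count syllables itself.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease score from raw counts of a text sample."""
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def style_description(score):
    """Flesch's verbal label for a Reading Ease score."""
    bands = [
        (90, "very easy"),
        (80, "easy"),
        (70, "fairly easy"),
        (60, "standard"),
        (50, "fairly difficult"),
        (30, "difficult"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "very difficult"


# Hypothetical counts for a 100-word sample: 5 sentences, 150 syllables.
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 1), style_description(score))

# Scores quoted in this article.
for quoted in (24, 46, 72, 85):
    print(quoted, style_description(quoted))
```

The labels returned for 24, 46, 72, and 85 agree with the descriptions given in this article (very difficult, difficult, fairly easy, and easy).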


Table 1 shows the “average” reading score of
the different publications sampled. Nine different
employee newspapers from our field
installations throughout the country were
sampled.

Obviously, to be able to relate the reading
ease scores of printed material to any reading
audience, it is necessary to know the educational
background of that audience. A study
of the personnel records of our civilian employees
showed that their educational level
very closely approximated that shown for the
U. S. adult population. Thus, the U. S.
figures used by Flesch in The Art of Readable
Writing were adapted for use in studying our
civilian audience.

Estimates of the educational levels of our
military readers are shown in Table 2.


Table 1

Reading Ease Survey of Different Types
of Publications

Type of                      Average Reading
Publication                  Ease Score         Description

AMC Technical Orders             46             Difficult
AMC Regulations                  12             Very difficult
AMC Letters                      28             Very difficult
AMC “Daily Bulletins”            25             Very difficult
Hq Office Instructions           13             Very difficult
Air Force Regulations            28             Very difficult
Air Force Letters            [illegible]        Very difficult
Civil Service Regulations        28             Very difficult
AMC employee newspapers          45             Difficult


Table 2

Cumulative Percentage of Airmen and of Officers
Having Various Amounts of Education

                             Per Cent of    Per Cent of
Educational Level            Airmen         Officers

Some grammar school            99.87         100.00
Grammar school graduate        91.97          99.97
Some high school               78.27          99.13
High school graduate           43.47          95.64
Some college                    6.39          62.22
College graduate                 .42          24.67
Post graduate                    .07           7.13


Translating Research Findings into Action 


Convinced that the gobble-de-gook in gov- 
ernment writing was a real barrier in manage- 
ment's effort to get its message across to the 
employees, positive steps were taken to sell 
operating officials on the merits of plain talk. 
It was recognized that, without the complete
support of top management, our re-educational
program stood little chance of succeeding.
Through the support of Major General J. M.
Bevans, Chief of the Personnel and Administration
Department, the writer presented a
talk before top staff officers of AMC. By
using visual aids in the nature of a large, illustrated
flip chart, the failure of our gobble-de-gook
was shown. Top management enthusiastically
indorsed plain talk.

For the next three months, lectures were
given on plain talk. Why it should be used,
how to use the Flesch formula, and what plain
talk would do were discussed. All major components
and divisions of the headquarters
were covered in this “educational” campaign.
At the same time, a work book was prepared
showing how to apply the reading ease formula,
with samples of re-writes of our various publications.
After each lecture, these work books
were distributed to the audience. Over 2,000
top officials were covered in these lectures.
The basic ground work was laid for getting
plain talk accepted by management.

The next step meant overcoming “the resistance
to change” of thousands of our lower
level supervisors. It was decided to publish 
an official manual to assist all those individuals 
who do any writing for the Air Materiel Com- 
mand. Attempting to practice what we were 
preaching, we prepared the AMC Manual 11-1, 
entitled, Gobble-de-gook or Plain Talk? This 
manual has a reading ease score of 72; style, 
fairly easy; audience level, 6th grade. Next, 
in an effort to attract and hold readership, the 
manual was illustrated with cartoons. Bold 
headings typical of commercial advertising 
were also used on each new topic under dis- 
cussion. Further, we felt that if this technique 
of writing our message in an easy-to-read style 
were to be accepted, some convincing selling 
had to be done. For many years the federal 
government and the Armed Forces have been 
using the same trite, unimaginative, difficult- 
to-read style of writing. Overcoming that 
habit pattern would not be easy. The selling 
points used in the manual were as follows: 

1. Gobble-de-gook is costly. We had a selling 
point that is not always applicable to private 
industry. Every work published is read 
during the work day. This even applies to 
our house organs (the civilian newspapers). 
The point was made that readable language 
saves reader time. And time saved means 
money saved for management. It costs 
$24,160.00 if our employees spend only ten 
minutes of their working time reading a four- 
page directive. This is a modest estimate. It
takes ten minutes to read and understand even 
one page of some of our gobble-de-gook writing. 
Obviously, if the message in the publication is 
not understood, then our national defense 
money is not being spent wisely. 
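
The arithmetic behind the $24,160 figure can be sketched as follows. The manual does not state the hourly rate it used; the sketch assumes that all of the "some 80,000" civilian employees read the directive for ten minutes each and works backward to the rate the quoted cost implies. The names below are illustrative.

```python
EMPLOYEES = 80_000       # "some 80,000 civilian employees" (assumed to all read it)
MINUTES_READING = 10     # minutes spent per employee
QUOTED_COST = 24_160.00  # dollars, as quoted in the manual


def reading_cost(employees, minutes, hourly_rate):
    """Cost in dollars of paid working time spent reading one publication."""
    return employees * minutes / 60 * hourly_rate


hours_spent = EMPLOYEES * MINUTES_READING / 60
implied_hourly_rate = QUOTED_COST / hours_spent

print(f"{hours_spent:,.0f} hours at about ${implied_hourly_rate:.2f} per hour")
```

On these assumptions the quoted cost corresponds to roughly 13,333 working hours at an average rate of about $1.81 per hour. The same function can be used to estimate the cost of a longer or harder directive by changing the minutes.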

2. Plain talk saves reader time. Why not 
honor your reader's time? Think of the reams 
of paper work that flow over the reader's desk 
daily. The writers of these papers are in com- 
petition with each other for the reader's atten- 
tion. Writing stripped of all gobble-de-gook 
stands a good chance of being read first. It’s 
brief and to the point. 

3. Plain talk style pleases your readers. Long, 
wordy sentences confuse the reader. Imagine 
trying to understand the ideas in one sentence 
of 461 words. Such a sentence was found in 
one of our military publications. The reason 
“Time” and “Reader's Digest” are so popular
is that they please the reader. The average
sentence in these magazines has only 17 words. 
They sell because people want to read them. 
You have ideas to sell, too. Do you write in a
style that pleases your reading public? 

4. Plain talk principles help get ideas across. 
It’s easy to get into the habit of writing for a 
mythical audience. But you are writing for 
real persons to read and understand. They 
may be busy commanders and division chiefs.
They may be supervisors who are more inter- 
ested in getting out production than wading 
through a stack of publications. They may 
be ungraded workers who quit school at the 
seventh grade. In every case, you must decide 
who makes up your audience. Who will read 
your message? If you know the educational 
background and reading ability of your 
audience, you will be able to write so they can 
understand you. 


Examples of Using Plain Talk 

Here’s what a current Air Force directive 
says about the grievance procedure: 

“(1) An employee who has a grievance or his 
representative will normally present the griev- 
ance, in the first instance, orally to the imme- 
diate supervisor. The supervisor will consider 
it promptly and impartially, collecting the 
necessary facts and reaching a decision. If 
the employee is not satisfied with the solution 
of the problem, he will be advised that he 
may discuss the problem with the next higher 
supervisor. 

"(2) If the employee feels that an interview 
with the immediate supervisor would be un- 
satisfactory, he or his representative may, in 
the first instance, present his grievance to the 
next supervisor in line. Where an employee 
feels an interview with the second supervisor 
would likewise be unsatisfactory he may seek 
counsel from the civilian personnel officer or 
his employee relations counselor, whose role 
will be to advise and aid him in facilitating the 
employee's approach to a supervisory level
determined appropriate by the facts in the
particular case.”


(Reading ease score, 24; style, very difficult;
audience level, college graduate)


Here’s how that section of the directive
could have been written for the employees:


“Is something about your job bothering you?

“Here are the steps you can take to solve 
your problem. In most cases it will be solved 
at the first step. If not, you have the right 
to keep going on up to the top. You may 
present your own case or have someone do it 
for you. 

“Talk with your supervisor. He has been 
told to give a prompt and fair answer to all 
problems. Usually, a short, friendly talk 
with him will fix things up. Be honest and 
sincere when you talk with him.

"If you feel that your supervisor will not 
handle your case fairly, you may go directly 
to his supervisor. Or, if you have gone to 
your supervisor and he didn't handle your 
problem to suit you, you may still go to his 
supervisor.

"If you feel your case has not yet been, or
will not be, handled fairly by either of them, go
to your personnel technician. He can't give 
you a final answer, but he can tell you how to 
get it." 


(Reading ease score, 85; style, easy; audi- 
ence level, 5th grade) 


Tips on Writing to Be Understood


In addition to presenting an explanation and
examples of using the Flesch formula for the
potential reader audience, the following tips on
writing to be understood were presented in our
manual:

The key question to all writing is, “What am
I trying to tell whom?”

If you consider the following points, you
may be fairly certain your message will be
understood.

Define your audience. When you sit down
to write, the first thought that should come to
your mind is, “Who will read it?” As a rule,
material that is written for the base commander
with college training is not suitable for the
ungraded employee who quit school at the
eighth grade. Writing can be simple enough
to be read with ease and understanding by a
poor reader and yet be interesting enough to
hold the attention of a good reader.

Define your purpose. Just what are you
trying to say? Is your purpose to get employees
to save their sick leave? Or is it to
explain the benefits of sick leave? If you are
not clear in your mind about what you want
to say, it’s a sure bet your reader won't know 
either. The purpose for writing is of foremost 
importance. The reader should be able to 
understand what to do, why it must be done, 
and how to do it. 

Present your ideas in logical order. A simple 
easy flow of related ideas is necessary if your 
message is to get the effect you want. Each 
part of your message should prepare the 
reader for what is to come. Don't jump from 
one idea to another. Complete your discus- 
sion of each idea before introducing another. 
Present one idea at a time. 

Avoid unnecessary technical “niceties.” 
Don't use fine distinctions in words when they
are not needed. Writers often spend too much 
time quibbling about technical niceties which 
have no real meaning for the reader. 

Keep the vocabulary familiar. Logic will 
not help the reader if he does not understand 
the words used. In AMC our biggest job is to 
find ways of writing about technical ideas.
These ideas are often complicated and some
technical words must be used. But the words
used to modify these technical words should be
familiar ones. Non-technical terms that will
cause trouble for the reader should be omitted
wherever possible. A technical idea is hard
enough to grasp without also including hard
non-technical words to confuse the reader.

Use simple sentences. It was the fashion
for many years to write articles with sentences
running well over 100 words. Today educators
clearly show us that writing can be more
easily read and remembered if the sentences 
are short. In order to do this, you should 
avoid using involved sentence structure. In- 
volved sentences are too much of a mental 
burden for most readers. Why ask your reader 
to expend mental effort trying to figure out 
what you are trying to say? Be brief and 
honor the reader by telling facts in short, 
brisk sentences. 

Use words of one or two syllables. Avoid 
words that will stop the reader. Use the good 
short ones that come first to your mind. 
Edit your paper from your reader's point of 
view before signing it. These few words found 
in current AMC directives are typical of the 
“stoppers” we mean: 
idiosyncrasies
adjudication
recapitulation
beleaguered


“Stoppers” add to the reader's difficulty in 
getting the message. Foreign phrases should 
be avoided. Also, don’t use short words that 
are not common. Using words of one or two 
syllables is not the whole answer. The words 
must also be understood. 

There have been many noticeable improve- 
ments in our communications since the incep- 
tion of the campaign for more plain talk in 
government writing. But like all ingrained 
habits, it will take time to unlearn the old ones 
and adopt the new ones. 


Received March 13, 1950. 
Early publication. 


Prediction of Academic Success in Three Schools of Nursing 


Albert H. Ford 


Towson, Maryland 


There are various reasons why nurse trainees 
withdraw from schools of nursing; paramount 
among them is academic failure. In a study 
by Horner (1), reported by Potts (3), approxi- 
mately 37 per cent of a group of more than 
15,000 students admitted to schools of nursing 
over a period of years were eliminated prior to 
completing their courses. Of those eliminated 
in two years’ classes, about 30 per cent with- 
drew because of academic failure. Potts (3) 
also found classroom failure the largest single 
reason for withdrawals from schools of nursing. 

Perhaps more significant than the total 
percentage of withdrawals from schools of 
nursing is the percentage of withdrawals in the 
early stages of training. Horner (1) found 
that 63 per cent of those withdrawing did so 
in the preliminary period of training; 84 per 
cent had left the school by the end of the first 
year. Potts (2) points out that in a particular 
group of 1,555, approximately 90 per cent of 
the eliminations for classroom failure came 
within the first six months of their course. 

It can be concluded from the foregoing that 
academic failure is one of the principal reasons 
for withdrawal from schools of nursing, and 
further, that failures occur early in the school's 
program. If we assume that better selection 
techniques are able to predict in advance an 
applicant's likelihood of academic success, then 
these techniques will save much of the expense 
ordinarily incurred by the withdrawal of un- 
successful trainees. 

It was for reasons such as these that the 
nurse training supervisors of three Knox- 
ville, Tennessee, training schools decided to 
solicit the aid of the University of Tennessee 
in the development of a more efficient selection 
program. 

Statement of the Problem 


Although the general problem involved was 
twofold—the determination of scholastic suc- 
cess, and also success in ward training or on- 
the-job training—this study was concerned 
only with the former phase. Its specific 


purpose was to determine the extent to which 
scholastic success in Knoxville hospital training
programs could be predicted from a battery


of tests administered to entering groups. 


Subjects 


The subjects used in this study were trainees 
accepted in the training programs of the East 
Tennessee Baptist Hospital, the Fort Sanders 
Hospital, and the Knoxville General Hospital. 
The original sample consisted of 187 trainees 
from six groups admitted to the schools without 
regard to the scores made on the selection tests. 
Each hospital contributed at least one
group. With the exception of one group, all
trainees were tested within 30 days of their
acceptance into training. The other group had
completed from one-half to one year of its
particular program.

The school entrance requirements are such
that applicants must be females between the
[...] University of Tennessee.


The Variables 


The original battery of predictors included
the following measures:


1. George Washington University Series
Reading Comprehension Test for Prospective
Nurses; Form 1, First Edition;
by Thelma Hunt.

2. George Washington University Series
Arithmetic Test for Prospective Nurses;
Form 1, First Edition; 1940; by Thelma
Hunt.

3. American Council on Education Cooperative
General Science Test; [...]
Series, Form X; by Paul E. Kambly.

4. Science Research Associates Primary
Mental Abilities; 1948; by L. L. Thurstone
and Thelma Gwinn Thurstone.


5. Kuder Preference Record; Science Re- 
search Associates; Form BB, 1942. 
6. High School Point Average. 


The criterion for the study was the point 
average of all scholastic grades earned by the 
trainee up to March, 1949, either at the 
hospital or on the University of Tennessee 
campus. If the trainee left the hospital prior 
to that time, her average was based on all 
courses completed prior to withdrawing. Each 
trainee took all of the tests; high school aver- 
ages were available for all trainees except 
one who was admitted on the basis of Vet- 
erans Administration high school proficiency 
examinations. 


Procedure 


Since the number of cases available (N = 187)
did not justify the inclusion of all the separate 
measures, the most promising in terms of 
predictive power were sought on the basis of 
the validity coefficients computed for the first 
two groups. From these correlations it seemed 
likely that the best measures to include in the 
final battery were: Reading Comprehension 
Test; Arithmetic Test; Cooperative General 
Science; Total Primary Mental Abilities; High 
School Point Average; and the Science and 
Social Service scales of the Kuder Preference 
Record. 
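
A validity coefficient in this sense is the correlation between a predictor and the criterion (here, the scholastic point average). The sketch below assumes the ordinary product-moment (Pearson) correlation, which the article does not name explicitly. The scores are invented for illustration and the function name `validity_coefficient` is hypothetical.

```python
from math import sqrt


def validity_coefficient(predictor, criterion):
    """Product-moment correlation between predictor and criterion scores."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(criterion) / n
    covariation = sum((x - mean_x) * (y - mean_y)
                      for x, y in zip(predictor, criterion))
    variation_x = sum((x - mean_x) ** 2 for x in predictor)
    variation_y = sum((y - mean_y) ** 2 for y in criterion)
    return covariation / sqrt(variation_x * variation_y)


# Invented scores for five trainees, for illustration only.
reading_scores = [40, 55, 60, 72, 85]
point_averages = [1.8, 2.4, 2.2, 3.0, 3.5]

print(round(validity_coefficient(reading_scores, point_averages), 2))
```

Computing the coefficient separately for each entering group, as was done for the first two groups, shows whether a measure predicts consistently from one group to the next.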

The validity coefficients of the various Kuder
scales were both small and inconsistent. The
largest obtained was .34 between the Scientific
Scale and the scholastic point averages of the
group; however, the second group revealed a
coefficient of only .005. The Scientific and
Social Service scales were retained in the final
battery more on the basis of reports in the
literature and the significant group centile
ratings on the measures than for their validity
coefficients. From the validity coefficients of
the subtests of the Primary Mental Abilities,
it did not appear that there was any additional
predictive power to be gained beyond that
provided by the total of that measure.

Similarly, in terms of the total sample 
available, it was considered advisable to com- 
bine the various groups from the three hospi- 
tals, thus considering them as derived from a 
homogeneous population. To ascertain if any 
differences did exist between the groups which 
would preclude treating them collectively
rather than as separate hospital groups, an
analysis of variance of the differences between 
groups was made on the variables to be in- 
cluded in the final battery. 

The F-ratios revealed that very significant 
differences existed on the Reading Compre- 
hension and Kuder Social Service, differences 
which were not likely to occur by chance alone 
one time in a hundred. The F-ratio found 
on the reading test was most striking and was 
the impetus for further analysis to determine 
the particular areas of greater difference. 

As was pointed out earlier, the students 
comprising one group had been in training 
from one-half to one year prior to taking the 
battery of tests, whereas the other groups 
were tested within 30 days of their admittance 
to the program. By observing the F-ratios 
depicting the within-group and between-group 
variances of all six groups, it became apparent 
that half of the total between-group variance 
was being contributed by the trained group on 
the reading test as well as on the arithmetic 
test. It seemed likely that by omitting the 
trained group, much of the variance in the 
reading test as represented by an F-ratio of 
5.31 would be eliminated. 

With the trained group eliminated, none of
the F-ratios except the one for the Scientific
scale of the Kuder Preference Record was
significant at the 1 per cent level of confidence
with 4 and 141 degrees of freedom. Four of
the measures, including the reading test, did 
not reveal differences significant at the 5 per 
cent level of confidence. The Cooperative 
General Science Test and the Social Service 
scale of the Kuder Preference Record revealed 
differences at the 5 per cent level, but not at 
the 1 per cent level. The F-ratio for the 
science test was 2.57, with a ratio of 2.44 being 
significant at the 5 per cent level with 4 and 
141 degrees of freedom. 
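
The F-ratios discussed above compare between-group with within-group variance. The following is a minimal sketch of that computation, not the author's worksheet: the function name `one_way_anova` and the small score lists are hypothetical, and only the degrees of freedom (4 and 141), the science-test ratio (2.57) and the 5 per cent critical value (2.44) are taken from the text.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three small groups (not the study's data).
groups = [[60, 62, 64], [61, 63, 65], [70, 72, 74]]
f_ratio, df_b, df_w = one_way_anova(groups)
print(round(f_ratio, 2), df_b, df_w)

# Degrees of freedom for five untrained groups totalling 146 trainees.
study_df = (5 - 1, 146 - 5)
print(study_df)

# The science-test ratio against the 5 per cent value quoted in the text.
F_CRIT_05 = 2.44
print(2.57 > F_CRIT_05)
```

With five groups and 146 trainees the sketch gives the same 4 and 141 degrees of freedom reported in the text.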

From the above, it was concluded that, with 
the exception of the Kuder Scientific scale, 
which was later eliminated from the battery, 
the differences which existed were not of a 
magnitude which would preclude combining 
the groups from different hospitals, thus re- 
garding them as derived from a homogeneous 
population. When we consider that all of the
hospitals have similar academic programs, that
their requirements for entrance are similar,
and further, that they admit applicants from
approximately the same area, it seems reason-
able that such influences would assure that the
applicants were derived from a reasonably
homogeneous population.


Albert H. Ford

Table 1
Criterion Correlations and Intercorrelations between Variables and the Mean and
Standard Deviation of Each Variable
Note: N = 137 except in variable High School Point Average where N = 136.

Test                              Read  Arith   Scn    PMA   Scn(K) S.S.(K)  HSPA   HTPA    Mean   S.D.
1. Reading Comprehension                 .54    .56    .56    ...    ...     .38    .56     62.9   13.4
2. Arithmetic                                   .40    .57   -.02    .04     ...    .39     24.9    9.8
3. A.C.E. Science                                      ...    ...    ...     .33    ...     26.8    9.4
4. Total P.M.A.                                              -.14   -.03     .40    .48    158.5    ...
5. Scientific (Kuder)                                                .14    -.01    .00     60.4    ...
6. Social Service (Kuder)                                                    .07    .01    102.3    ...
7. High School Point Average                                                        ...      2.4     .5
8. Hospital Training Pt. Average                                                             2.0     .9

Thus, after eliminating the trained group 
from the over-all sample, there remained five 
groups with a total of 146 individuals. 

The next phase of the study consisted in 
determining the criterion correlations and the 
intercorrelations of each of the final measures of 
the five untrained groups. 


Table 2

Table of Beta Weights and Coefficients of Multiple
Correlation Showing Successive Changes as
Measures Are Eliminated from the Battery

Test and Battery        Beta Weights    Multiple R
Coop. Science               .304
Reading Comp.               .242
High Sch. Pt. Av.           .286           .699
Total P.M.A.                .060
Arithmetic                  .014

Coop. Science               .304
Reading Comp.               .246
High Sch. Pt. Av.           .286           .699
Total P.M.A.                .056

Coop. Science               .312
Reading Comp.               .254
High Sch. Pt. Av.           .299           .698
Arithmetic                  .034

Coop. Science               .316
Reading Comp.               .269           .697
High Sch. Pt. Av.           .303


Table 1 gives the product moment correla- 
tions between the variables, with the mean and 
standard deviation of each variable. Both 
scales of the Preference Record yielded very 
low validity correlations; in fact, these two 
measures correlated neither with each other
nor with any of the other measures to an extent
greater than .14. It was at this point that the
Scientific and Social Service scales were
dropped from the battery.

Table 2 gives the beta weights and multiple
correlation coefficients for various combina-
tions of the final measures, showing successive
changes as measures with lower criterion cor-
relations are eliminated from the battery.

It was concluded from Table 2 that the most
practical regression equation for the prediction
of scholastic point averages for students from
the general population concerned in this study
would include the reading test, the science test,
and the high school point average, which gave
a multiple R of .697 as compared to an R of
.699 for all measures. The other measures,
although giving criterion correlations as high
as .48 (Table 1), do not add sufficiently to the
predictive power of the battery to justify the
time and expense involved in administering
and scoring the tests.


Shrinkage of the Multiple R


When the coefficient of multiple correlation
is determined from a given set of data, as
above, and is applied to a second set of data,
the yield in the latter case will be less, even
though the second set of data is strictly com-
parable. This shrinkage of the multiple R
varies with the number of variables contained
in the regression equation, the number of 
cases, and the size of the coefficient of correla- 
tion. A shrinkage-deduction formula has been 
devised by Smith (4) to apply to the coefficient 
of multiple correlation which provides a more 
accurate estimate of the multiple R. When this 
formula is applied to the present data a multiple 
R of .686 is forthcoming, indicating that shrink- 
age to the extent of .011 can be expected when 
the regression equation is applied to a second 
group. 
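
The article does not print the shrinkage formula itself. The sketch below assumes one form commonly attributed to Smith in the statistical literature, R'^2 = 1 - (1 - R^2) N / (N - m); the function name `shrunken_r`, the choice of N = 137 (the Table 1 note) and the count m = 4 (three predictors plus the criterion) are all assumptions made for illustration. With those inputs the result rounds to the .686 reported above, but that agreement alone does not establish that this is the formula the author applied.

```python
from math import sqrt


def shrunken_r(r, n_cases, n_vars):
    """Shrunken multiple R under the ASSUMED correction
    R'^2 = 1 - (1 - R^2) * N / (N - m)."""
    r_squared = 1 - (1 - r ** 2) * n_cases / (n_cases - n_vars)
    # Guard against a negative estimate when R is small and m is large.
    return sqrt(max(r_squared, 0.0))


# Multiple R of .697 from Table 2; N and m are assumptions (see lead-in).
estimate = shrunken_r(0.697, n_cases=137, n_vars=4)
print(round(estimate, 3))
print(round(0.697 - estimate, 3))  # expected shrinkage
```

The correction grows as the number of variables rises or the number of cases falls, which is the dependence the paragraph above describes.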

The formula for the most economical pre- 
diction of the criterion in terms of raw scores 
was: 


$$\text{Hospital Training School Average} = .017\,X_{\text{read.}} + .029\,X_{\text{sci.}} + .420\,X_{\text{h.s.p.a.}} - .87$$
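
As an illustration of how the raw-score equation might be applied, the sketch below uses the weights .017 (reading), .029 (science) and .420 (high school point average) with the constant -.87, together with the cutting score of 1.08 mentioned in the Conclusions. The function names are hypothetical, and the test means (62.9, 26.8 and 2.4) are the Table 1 values as best they can be read, so they should be treated as assumptions.

```python
CUTTING_SCORE = 1.08  # cutting score quoted in the Conclusions


def predicted_average(reading, science, hs_point_average):
    """Predicted hospital training point average from raw scores."""
    return (0.017 * reading
            + 0.029 * science
            + 0.420 * hs_point_average
            - 0.87)


def meets_cutting_score(predicted):
    """True when the predicted average reaches the cutting score."""
    return predicted >= CUTTING_SCORE


# An applicant scoring exactly at the (assumed) group means.
at_means = predicted_average(62.9, 26.8, 2.4)
print(round(at_means, 2), meets_cutting_score(at_means))

# A hypothetical weak applicant.
weak = predicted_average(40, 15, 1.5)
print(round(weak, 2), meets_cutting_score(weak))
```

An applicant at the group means is predicted to earn about 2.0, which agrees with the mean hospital training point average in Table 1.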


Conclusions 


On the basis of the study the following con- 
clusions seem to be justified: 


1. The science test, reading test and high
school point average were fairly effective in
predicting success in the schools of nursing, 
the multiple correlation coefficient being .697. 

2. Although the arithmetic test and Total 
P.M.A. correlated rather well with the criter- 
ion, .392 and .479 respectively, neither of these 
tests, either separately or together, added ap- 
preciably to the predictive power of the battery. 
Increases in the multiple R were significant 
only in the third decimal place. 

3. The sub-tests of the P.M.A., in general, 
correlated less highly with the criterion than
did the total of that measure, as evidenced by 
validity coefficients computed for two early 
groups. 

4. Although the averages of the total group 
showed its members to be more interested in 


science and social service, as measured by 
those two scales of the Kuder Preference 
Record, than women in general, neither scale 
contributed anything to the predictive power 
of the battery. The validity coefficients were 
as follows: Scientific, .000; Social Service, .010. 

5. Significant differences between groups 
from different hospitals appear when one group 
has had one-half to one year’s training prior
to taking the tests and the remaining groups 
are tested within 30 days of their acceptance 
into the hospital’s training program. Such 
differences were most striking on the Reading 
Comprehension test; the F-ratio for that test 
being 5.31 with the trained group included in 
the total population. Although such an 
F-ratio indicates differences between the groups 
which are significant at the 1 per cent level of 
confidence, with the trained group eliminated, 
existing differences are not significant at the 
1 per cent or 5 per cent levels of confidence. 

6. Had the regression equation forthcoming 
from this study been used in selecting the 
trainees participating in the study, 13.1 per 
cent of them would not have been admitted 
(assuming 1.50 as the hospital critical score 
and the score one P.E. below 1.50, or 1.08 as 
the cutting score). 


Received August 25, 1949. 


References 


1. Horner, H. H. Nursing education and practice in 
New York State with suggested remedial measures. 
Albany: University of the State of New York, 
1934, pp. 38. 

2. Potts, Edith Margaret. The selection of student 
nurses. Amer. J. Nurs., 1941, 41. 

3. Potts, Edith Margaret. Use of tests in selecting 
student nurses advantageous to hospital and 
student. Hospital Mgmt., 1941. 

4. Smith, B. B. Forecasting the acreage of cotton. 
J. Amer. statist. Ass., 1925, 20, 31-47. 


Critical Requirements for Dentists * 


Ralph F. Wagner 


American Institute for Research and Department of Psychology, University of Pittsburgh 


An attempt to improve selection methods at 
the School of Dentistry led to the discovery 
that no systematic investigation had been 
carried out to determine the characteristics 
either of a successful dental student or an 
effective practicing dentist. Since this infor- 
mation is important for the development of 
personnel procedures, research was undertaken 
at the University of Pittsburgh to obtain a 
precise and practical definition of requirements 
for the profession. 

The method employed is called the critical 
incident technique. Information on the pro- 
cedure has only recently appeared in the litera- 
ture. It was developed in order that a com- 
prehensive list of behaviors of the kind which 
make the difference between success and failure 
in an activity or profession might be obtained. 
Persons in or normally associated with the 
profession and who are considered qualified to 
judge competency with respect to one or more 
phases of the job are asked to describe the most 
recent incident they observed in which a par- 
ticipant carried out a part of the job either in 
a particularly effective or ineffective manner. 
They are asked to describe the situation, the 
relevant circumstances and exactly what the 
participant was observed to do. No inter- 
pretation is requested regarding abilities, apti- 
tudes, motivations, and attitudes, which might 
have been responsible for the behavior. 

After a large number of incidents are ob-
tained, an analysis is made to determine the
specific behavior which caused the observer to
judge an individual as effective or ineffective.
Grouping and structuring of these behaviors
result in: (1) a list of those aspects of the job
which are “critical” in the sense that they
caused an observer to make a judgment of job
effectiveness; and (2) a series of specific state-
ments of the contrasting ways in which effec-
tive and ineffective job participants behave in
carrying out these aspects. The “critical
requirements” of the occupation are then
derived through an analysis and organiza-
tion of these contrasting ways of carrying out
each aspect.


* The study was conducted ... T. E. ... and Dr. ...,
... Dentistry, and Dr. J. C. Flanagan, Department of
Psychology. Dr. Wagner, who carried out the research,
... Research and Lecturer ...
1 Flanagan, J. C. ... W. (Ed.), Current ...
Pittsburgh, University of Pittsburgh ...


The Present Study 


Incidents were obtained from three sources:
(1) patients; (2) dentists themselves; and (3)
instructors in dental school clinics. Patients
were not expected to supply data regarding
the technical aspects of dentistry; this infor-
mation was to be obtained from dentists and
instructors. As expected, incidents supplied by
patients dealt more with personality, ...,
business practices, appearance of the office, and
similar factors. Clinic instructors provided
particularly useful data for determining re-
quirements both for success in dental school
and for effectiveness in general practice. The
instructor has an opportunity to observe dental
practice being carried on by persons
varying considerably in skill and is in a position
to know the important details when behavior
which is critical occurs. They correspond to
supervisors who have, in other studies, proved
to be a source of particularly useful data. In
the case of practicing dentists there is no indi-
vidual in a position comparable to that of an
instructor or supervisor. In most instances
the dentist who was doing the actual work is
acquainted with all relevant circumstances.
Dentists were therefore requested to describe
incidents concerning their own performance.

In securing incidents from all three sources
(patients, dentists, and instructors) the
word “effective” or “competent” was used in
referring to the desirable type of dentist.
These words were purposely used in place of
“successful” because of the monetary connota-
tion of the latter. Moreover, it was requested
that the identity of the individual whose per-
formance had been observed not be given
either in effective or in ineffective incidents.

Patients were requested to describe two 
kinds of incidents—those in which the dentist’s 
performance had caused them to recommend 
him enthusiastically to a friend and those in 
which his performance had caused them to 
change or consider changing to a new dentist. 
Dentists were also asked to describe such inci- 
dents in which it had been their own perfor- 
mance which was responsible for the patient’s 
action. In addition, however, they were 
asked for incidents in which the patient was 
unaware that particularly effective or ineffec- 
tive performance had occurred but which 
nevertheless had caused the dentist himself 
either to feel a great deal of professional satis- 
faction or to feel that he would perform more 
effectively if given a second opportunity. 
Clinic instructors were asked to describe in- 
cidents in which they had observed a student 
perform in either a particularly effective or 
ineffective manner. Particularly effective per-
formance was defined as performance the in-
structor might wish to cite in the classroom,
insist that all students copy, or the kind which
would contribute significantly to the student’s
effectiveness if he were in practice. Ineffective
performance was defined as the kind which,
if it occurred repeatedly, or even once under
certain circumstances, would cause the in-
structor to doubt seriously the student’s prob-
able effectiveness in practice.

A total of 781 incidents were obtained: 257
from patients, 359 from dentists, and 165 from
clinic instructors. One of the most interesting
findings of the study was the enthusiastic
manner in which dentists participated. There
was some concern at first as to how willing an
individual would be to describe ineffective
incidents concerning his own performance. It
was found, however, that dentists gave such
incidents as fully as they gave effective inci-
dents. The following example is typical.

About six months ago a boy of sixteen for
whom I had done several fillings came in for
his regular weekly appointment. I was rushed
and did a rapid filling for him, not being careful
about the depth of cavity preparation. No
cement base was placed. Patient returned
several days later complaining of severe tooth-
ache. I removed filling and found pulp ex-
posure. Refilled tooth with sedative cement.
Patient did not return but a dentist friend said
he extracted the tooth for the patient several
weeks later.


Another example will further illustrate the 
nature of the data collected in the study. This 
incident was obtained from a clinic instructor 
and should be of interest to dental educators. 


Student came to me with a gold inlay filling. 
He told me filling was not good and he would 
like to do the work over. On examining the 
filling in the mouth I found the filling was not 
quite as bad as the picture he had given me 
and could very possibly have been made satis- 
factory. However this student showed both 
the knowledge and was conscientiously inter- 
ested in the patient’s welfare and his own work 
to repeat the work. Most students are satis- 
fied with only fair work and do not generally 
want to repeat it to make it better. 


An analysis of the incidents from all three 
sources indicated that there were four main 
aspects in serving as a general practitioner. 
The following titles seemed descriptive of these 
aspects: I. Demonstrating Technical Profi- 
ciency; II. Handling Patient Relationships; 
III. Accepting Professional Responsibility; and 
IV. Accepting Personal Responsibility. 

The critical behaviors under each of the 
above main areas were grouped into sub-areas. 
The behaviors under Area I were grouped 
according to the specific treatment being ren- 
dered. Development of sub-areas within the 
other major areas, however, was accomplished 
on logical grounds, guided by the actual nature 
and distribution of the behaviors. Although 
the behaviors in these areas were not related 
to any specific type of treatment, it was found 
that they could be grouped into relatively dis- 
crete sub-areas. 

As a means of summarizing the content of 
the incidents, definitions of major areas and 
sub-areas were written. These definitions pro- 
vided a detailed description of the nature of 
the dentist's work and responsibilities as de- 
rived from the 781 incidents analyzed in the 
study. As a second means of summarizing the
content of the incidents, a tentative group of
40 “critical requirements” were defined. They
consist of a series of statements which express
the specific way an outstanding dentist per-
forms in the important situations which are
characteristic of his profession. Some indi-
cation of the relative importance among the
40 critical requirements is provided by the
frequency of their occurrence in the incidents
from each of the three sources.²


Table 1
Distribution of Critical Behaviors Among the Four Major Categories

               Technical     Patient        Professional     Personal
Source         Proficiency   Relationship   Responsibility   Responsibility   Total
Patients          107           168               6               94           375
Dentists          138           160              37               95           430
Instructors        65            36              15               59           175

Although it is impossible to present the 40 
critical requirements and their frequencies in 
the present paper, the number of critical be- 
haviors falling into each of the four major 
categories, broken down according to the 
source of the incident, is shown in Table 1. 
These frequencies exceed the number of inci- 
dents collected in the study since an incident 
often contained more than one critical behavior. 
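
The relation between behaviors and incidents can be checked with a short tally. This is only an illustrative sketch: the dictionary names are hypothetical, the category counts are those of Table 1, and the incident counts (257, 359 and 165) are those given earlier in the text.

```python
# Critical behaviors per source, in the order: technical proficiency,
# patient relationship, professional responsibility, personal responsibility.
behaviors = {
    "Patients": [107, 168, 6, 94],
    "Dentists": [138, 160, 37, 95],
    "Instructors": [65, 36, 15, 59],
}
# Number of incidents collected from each source.
incidents = {"Patients": 257, "Dentists": 359, "Instructors": 165}

totals = {source: sum(counts) for source, counts in behaviors.items()}
for source, total in totals.items():
    per_incident = total / incidents[source]
    print(source, total, round(per_incident, 2))

# Grand totals across the three sources.
print(sum(totals.values()), sum(incidents.values()))
```

Each row total exceeds the corresponding number of incidents, which is what the text leads one to expect when a single incident can contain more than one critical behavior.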


Summary 


The present study furnished information re- 
garding the critical aspects of dental practice. 
Its purpose was not to produce a curriculum 
for training dental students. The study does, 
however, furnish information on the specific 
kinds of dental practice which have frequently 
made the difference between effectiveness and 
ineffectiveness both in practice and in dental 
school clinics. The results indicate the areas
which most frequently cause difficulty while the 
student is in the clinic and after he has gone 
into practice and provide information on the 
type of practice which the patient, the dentist, 
and the clinic instructor consider particularly 
effective. The conclusions which these results 
suggest are as follows: 


1. The requirements for effectiveness in den-
tistry are complex. They are not confined
alone to the demonstration of technical
proficiency. Although the critical nature of
this area is strongly supported, there are
non-technical behaviors which are also criti-
cal to effectiveness. “Handling Patient Rela-
tionships” is a particularly important area.
Other critical areas, to use the titles adopted
in the present research, are “Accepting
Professional Responsibility” and “Accepting
Personal Responsibility.”


² Persons interested in a fuller description of the
study, including the tentative statements of critical
requirements and definitions of areas and sub-areas,
should write to the School of Dentistry, University of
Pittsburgh, Pittsburgh 13, Pa. Also, this information
may be ordered as Document No. 2826 from American
Documentation Institute, 1719 N Street, N.W., Wash-
ington 6, D. C., remitting $.50 for microfilm (images 1
inch high on standard 35 mm. motion picture film) or
... for photocopies (6 X 8 inches) readable without
optical aid.

2. Many of the characteristics which have
commonly been accepted as important to
effectiveness in dentistry must be questioned.
“Ability to converse on topics of the day,” for
example, has been mentioned as important
upon various occasions yet in the 781 incidents
analyzed in the present study it appeared only
once. The same was found to be true of
“asking questions when the patient is unable
to answer.” And, although applicants have
in the past been rated upon voice quality, this
was not found to be a factor in any incident.
In comparison, “discussing treatment being
planned or rendered” occurred in 61 incidents.
In view of the complexity of requirements
which the present study suggests, it would
seem advantageous to concentrate on those
which are critical.

3. Critical behaviors revealed by the present
study provide an additional basis for evaluating
clinic performance. A form, developed in
cooperation with persons experienced in dental
education, for recording systematically ob-
served occurrences of critical behavior would
provide objective evidence on which to base
judgments of effectiveness. These judgments
would be closely related to the requirements of
actual practice. As such they would reflect
the adequacy of training, indicate eligibility
for graduation, and provide the realistic criter-
ion which selection batteries might strive to
predict.


Received March 22, 1950. 
Early publication. 


The Intra-Individual Relationship Between Interest and Ability 
S. M. Wesley, Douglas Q. Corey, and Barbara M. Stewart 


University of Southern California 


An understanding of problems related to
educational and vocational guidance has be-
come increasingly important during the post-
war years. Advisement of veterans, by ar-
rangement with the Veterans Administration,
alone has resulted in the setting up of centers
which annually provide this service to many
thousands of former servicemen.

A question which has long been considered a
primary concern in this field is that of the
relationship between vocational interests and
vocational abilities. Summaries of studies
concerned with the problem are presented by
Strong (4) whose review illustrates a trend in
the direction of more adequate methodological
procedures for determining the degree of this
relationship. Early investigations compared
various interests on a vocational interest test to
overall ability as measured by a single criterion
such as intelligence test scores or college grade
point averages. The results of such studies,
in general, indicated low or negligible correla-
tions. An explanation for this seemed to lie
in the fact that different interests were matched
with a single general ability rather than
with specific abilities corresponding to those
interests.

When studies were designed in which each
interest was matched to a corresponding
ability, higher correlations were obtained. A
profitable approach (resulting in significant
positive correlations ranging from .32 to .40)
was that of Triggs (5) who compared interest
scores of one hundred college men on the Kuder
Preference Record to corresponding ability
scores on the Iowa High School Content Ex-
amination. However, the magnitude of the
relationship shown by such studies continued
to be surprisingly low.

Since the ability scores used in Triggs' study
were based on deviations from the group means,
the following question suggests itself: What is
the relationship between interest and ability
when scores for each individual represent de-
viations from his own mean, rather than from
the group mean?


Some evidence that this approach might 
result in higher correlations is given in the 
study of Segel (2). He compared interests, 
as measured by the Strong Vocational Interest 
Blank, with abilities as measured by the Iowa
High School Content Examination. However, 
his method was unique in that he correlated 
interest scores with differences between two 
ability scores, and obtained higher correlations 
than those found between an interest and an 
absolute ability score in a corresponding area. 
For example, scores on the Engineer key corre- 
lated .57 with the difference between scores in 
Mathematics and scores on a History and Social 
Science test, while the same interest correlated 
only .49 with scores in Mathematics alone. 
Although the differences between these two 
types of correlation were not statistically 
significant (results were based on one hundred 
cases), they do suggest that the use of more 
than two abilities in relation to each other 
might point the way to even higher correlations 
than those previously shown. The writers 
felt that if a truly relative score were derived 
by first calculating a mean ability score for 
each individual, a greater relationship might 
be shown to exist. 


Procedure 


In order to investigate this hypothesis a 
study (1, 3) was conducted in which tests of 
interest and ability were administered to 156 
male college students enrolled in an introduc- 
tory psychology course. Since tests were 
administered on different days the number 
taking any one test was not constant and 
ranged from 115 to 132. The measure of in- 
terest used was the Kuder Preference Record 
which yielded scores in the following areas: 
mechanical, computational, scientific, artistic, 
literary, musical and clerical.¹ Measures of
ability were selected to correspond to these
interest categories as follows: Survey of Me-
chanical Insight; Stanford Arithmetic Test;
Iowa High School Content Examination,
Section 3, Science; The Meier Art Judgment
Test; Iowa High School Content Examination,
Section 1, English and Literature; Seashore
Measures of Musical Talents, Series A;
and Minnesota Vocational Test for Clerical
Workers.

¹ The persuasive and social service categories were
not included because of the lack of adequate tests of
ability in these areas.

Total scores were computed for all tests and 
were equated by converting to standard scores. 
Pearson r correlations were first obtained be- 
tween interest and its corresponding ability in 
each of the seven areas. This was done in 
order to provide data based on the traditional, 
or inter-individual method, to which results 
of the new, or intra-individual method might 
be compared. 

To obtain correlations based on scores repre- 
senting deviations from the individual's own 
mean, rather than that of the group, the mean 
ability score for each subject was com- 
puted (in terms of standard score units). 
The differences between his separate ability 
scores and his own mean level of ability were 
next determined. These differences then be-
came the measures of relative ability to be
correlated with the corresponding measures 
of relative interest. Since the Kuder Prefer- 
ence Record is so constructed that every item is 
chosen at the expense of another, it was as- 
sumed that the resulting interest scores were 
approximately relative to the individual mean 
and that further manipulation of these data 
was therefore not necessary. Pearson r cor- 
relations between relative interest and relative 
ability, for each of the seven areas, were then 
computed. 
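
The contrast between the two kinds of score can be sketched in a few lines. This is a hedged illustration of the procedure just described, not the writers' own computation: all scores below are invented, and the names `standard_scores`, `pearson_r` and `relative` are hypothetical.

```python
from statistics import mean, pstdev


def standard_scores(raw):
    """Convert raw scores to standard scores across subjects."""
    m, s = mean(raw), pstdev(raw)
    return [(x - m) / s for x in raw]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


# Invented raw ability scores: one list per test, one entry per subject.
ability_raw = {
    "mechanical": [30, 42, 25, 38, 33],
    "clerical": [110, 95, 120, 100, 105],
    "literary": [55, 60, 40, 70, 52],
}
# Invented interest scores for the same five subjects.
interest = {
    "mechanical": [62, 80, 45, 70, 66],
    "clerical": [50, 35, 60, 40, 48],
    "literary": [30, 44, 20, 52, 33],
}

areas = list(ability_raw)
z = {a: standard_scores(ability_raw[a]) for a in areas}
n_subjects = len(z[areas[0]])

# Relative ability: each standard score minus the subject's own mean.
relative = {a: [] for a in areas}
for i in range(n_subjects):
    own_mean = mean(z[a][i] for a in areas)
    for a in areas:
        relative[a].append(z[a][i] - own_mean)

for a in areas:
    r_group = pearson_r(z[a], interest[a])
    r_individual = pearson_r(relative[a], interest[a])
    print(a, round(r_group, 2), round(r_individual, 2))
```

By construction each subject's relative ability scores sum to zero, so a high relative score in one area necessarily means lower relative scores elsewhere, much as the text assumes for the forced-choice interest scores.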

Since these correlations indicated the rela- 
tionship only for the group as a whole, it was 
considered desirable to investigate as well the 
variation in interest-ability agreement for indi- 
viduals within this group. Accordingly, for 
each of the one hundred students who had 
completed all tests and for whom complete data 
were therefore available, the standard scores 

for his seven areas of interest and ability were 
arranged in two rank-order sequences. Rho 
coefficients of correlation were then computed 
for each of the individual sets of scores. 
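
For a single student the rank-order coefficient over seven areas can be sketched as follows. The scores are invented, the helper names are hypothetical, and the sketch assumes no tied scores, so that the simple formula rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)) applies.

```python
def ranks(scores):
    """Rank scores from highest (1) to lowest; assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Rank-difference correlation for two untied score lists."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Invented standard scores for one student in seven areas: mechanical,
# computational, scientific, artistic, literary, musical, clerical.
interest = [1.2, 0.4, -0.3, 0.9, -1.1, 0.1, -0.8]
ability = [0.9, 0.5, -0.2, 1.1, -1.0, 0.0, -0.6]

rho = spearman_rho(interest, ability)
print(round(rho, 3))
```

Here only the first and second ranks are interchanged between the two orderings, so the coefficient stays close to +1.00; a student whose orderings were reversed would receive -1.00.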




Results 


There were thus obtained, first, seven
interest-ability correlations based on devia-
tions from group means, and seven correlations
based on deviations from individual means.
As may be seen from Table 1, these ranged
from .07 to .47 (not corrected for attenuation)
where group means were used, and from .23 to
.68 where individual means were used. The
seven “group mean” correlations were averaged
(after transforming to Fisher z values) and
resulted in a mean Pearson r correlation of .30.
The average of the seven “individual mean”
correlations was .42. The t ratio, based on the
difference between mean z values, was 3.3,
which is significant above the 1 per cent level
of confidence.²

The 100 individual rank-order correlations
ranged from -.57 to +1.00, and (by trans-
forming to Fisher z values) a mean of +.46 was
obtained. The difference between this mean
and that of .30 for the “group mean” correla-
tions also resulted in a t ratio of 3.3, significant
above the 1 per cent level of confidence.
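
The averaging by way of Fisher z values can be illustrated briefly. The sketch below is an assumption-laden check rather than the writers' worksheet: the function name `mean_correlation` is hypothetical, and the seven “individual mean” correlations are entered as they are read from Table 1 (.50, .47, .35, .31, .68, .23 and .33).

```python
from math import atanh, tanh


def mean_correlation(rs):
    """Average correlations by transforming to Fisher z and back."""
    zs = [atanh(r) for r in rs]  # z = 0.5 * ln((1 + r) / (1 - r))
    return tanh(sum(zs) / len(zs))


# "Individual mean" correlations as read from Table 1 (an assumption).
individual_mean_rs = [0.50, 0.47, 0.35, 0.31, 0.68, 0.23, 0.33]
average_r = mean_correlation(individual_mean_rs)
print(round(average_r, 2))
```

Under these readings the back-transformed mean rounds to .42, in agreement with the mean reported for the “individual mean” correlations.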

It was thus shown that significantly higher 
correlations are obtained by the use of the w 
methods presented here than by the m 
traditional method of comparing interests E 
abilities. It is to be expected that with e 
development of better tests, and with ne 
exact matching of tests of interest and abi wi 
an overall relationship in excess of that fou 
here may be demonstrated. Although in eve 
vocational area: the correlation was MC lity 
when individual levels of interest and ee 
were used, in calculating the mean corre pe P 
substantial relationships in some areas Thi 
offset by low relationships in others. e 
variation in the correlations obtained may : F 
be, in part at least, a function of variatio! à 
the extent to which each of the ability de 
reflects experience as opposed to aptitude: n0 
would be expected that a test which i5 P tor 
heavily weighted with the experience res 
would show a higher relationship with int | 


? This difference is even more significant ber 
considered that the formulas available for t% jes 
nificance of differences between correlation’ y dom 
that the figures are obtained from independen ation 
Samples. Tn this study, of course, the t any su 
were obtained from the same sample, 5° that tive gid 
test of significance would err on the conserva 


— 


Tntra-Individual Relationship Between Interest and Ability 


Table 1 


Correlations Between Interest and Ability Based 
on Deviations from Group and from | 
Individual Means 


Group Means Individual Means 


Vocational 

Area N* r N* r 
Mechanical 131 Ad 126 .50 
Computational — 115 24 112 AT 
Scientific 126 33 126 35 
Artistic 131 .29 127 31 
Literary 125 AT 125 68 
Musical 122 21 118 .23 
Clerical 132 07 125 33 

Mean 30 42 


* y $, H 
The size of N in each case depended on the number 
of subjects for whom necessary data were available. 


because of the correlation which we know to 
exist between interest and experience. An ex- 
amination of the correlations in Table 1 shows 
that those tests in which the experience factor 
would play a larger part, such as the Iowa 
Literary Test and the Survey of Mechanical 
Insight, have a higher correlation with interests 
than do those ability tests such as the Minnesota
Clerical and the Seashore Tests which
have a lower weighting with the experience
factor. However, there still exists the reasonable
hypothesis that there is a genuine variation
in the degree of relationship between interests
and abilities for different activity or
vocational areas.
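
As an illustration only, the two scoring bases compared in Table 1 can be sketched in a few lines of Python. The sketch assumes that every interest and ability score has already been put on a common scale (for example, standard scores) and that a subject's "individual mean" is simply the average of that subject's scores across the areas. The function names and the sample figures are hypothetical; none of these details is taken from the study itself.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def individual_deviations(scores):
    """Express each score as a deviation from that subject's own mean.

    `scores` holds one row per subject and one column per vocational area.
    """
    return [[value - sum(row) / len(row) for value in row] for row in scores]


def area_correlations(interest, ability):
    """Pearson r between interest and ability, computed area by area."""
    n_areas = len(interest[0])
    return [
        pearson_r([row[j] for row in interest], [row[j] for row in ability])
        for j in range(n_areas)
    ]


# Hypothetical scores for four subjects in three areas (not data from the study).
interest = [[3, 1, 2], [6, 4, 8], [5, 9, 7], [2, 5, 3]]
ability = [[2, 1, 3], [9, 7, 8], [4, 6, 5], [1, 4, 2]]

# Group-mean basis: r is unchanged by subtracting each area's group mean,
# so the scores can be correlated as they stand.
group_basis = area_correlations(interest, ability)

# Individual-mean basis: remove each subject's own general level first.
individual_basis = area_correlations(
    individual_deviations(interest), individual_deviations(ability)
)
```

The point of the second basis is that a subject who scores high (or low) on everything no longer inflates or masks the relation within a single area; only his relative standing among his own scores is correlated.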


The Meaning of Individual Differences 
in Interest-Ability Congruency 


The wide range in interest-ability congru- 
ency for different individuals (—.57 to +1.00) 
raises what may be an important question for 
those concerned with vocational guidance: 

Why for some individuals is the correlation
high and positive, while for others it is low or
even negative? In order to explore this area,
age, intelligence and personality factors were
studied to determine whether they might be
related to these individual differences. The
measures of intelligence and personality,
respectively, were the Army Alpha Examination,
First Nebraska Revision, and the Minnesota
Multiphasic Personality Inventory. An upper
and lower 25 per cent of the group were
selected, based on those having highest and
those having lowest interest-ability congruency
as shown by the individual rank-order
correlations. For these extreme groups a
comparison was made of mean age and of mean
intelligence scores, but the differences were
found to be insignificant. Similar comparisons
were made between mean scores obtained
by the upper and lower groups for each
of the nine categories of the Minnesota
Multiphasic Personality Inventory. Although only
one significant difference (for the Schizophrenia 
scale) was found, the group having highest 
agreement between interests and abilities 
showed scores on eight of these scales above 
those of the other group, and in a direction 
away from the level of "normal" adjustment. 
The meaning of this finding is not clear, but 
it appears to be related to a greater differentia- 
tion in interests and in abilities for those 
having a less adequate personality adjustment. 
It is felt that a different test of personality, 
concerned with basic character structure rather 
than with nosological groups, might reveal 
important differences between those indi- 
viduals whose interests and abilities are in 
agreement and those where marked deviations 
are found. It is possible that the use of pro- 
jective measures of personality would demon- 
strate such a relationship and contribute to an 
increased understanding of these individual 
differences in interest-ability congruency. 
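
The individual rank-order correlations discussed above can be illustrated with a short sketch. It assumes the usual Spearman formula for untied ranks, rho = 1 - 6*sum(d^2) / (n(n^2 - 1)), applied to one subject's seven interest ranks and seven ability ranks; the text does not state which rank-order formula was used, and the function name and sample ranks are hypothetical.

```python
def spearman_rho(interest_ranks, ability_ranks):
    """Spearman rank-order correlation for untied ranks (assumed formula)."""
    n = len(interest_ranks)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(interest_ranks, ability_ranks))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical subject whose interests and abilities fall in the same order.
same_order = spearman_rho([1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4, 5, 6, 7])

# Hypothetical subject whose interests run exactly opposite to his abilities.
reversed_order = spearman_rho([1, 2, 3, 4, 5, 6, 7], [7, 6, 5, 4, 3, 2, 1])
```

With seven areas the denominator is 7 x 48 = 336, so under this assumed formula a coefficient of +1.00 means no rank differed, and a coefficient near -.57 corresponds to a sum of squared rank differences of 88.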


Predictability of Extreme Ranks 


In computing the rank-order correlations 
between interest and ability for the individual 
subjects in the study, a tendency was noted for 
high ranking and low ranking ability areas to 
have greater predictability in terms of interest 
test scores than those ability areas in the 
middle of the individual's range. 

Further analysis of the data was therefore 
made to determine the predictability of each 
interest rank from ability rank. A study of 
the rank-order sequence of interest and ability 
for each individual showed that for 31 per cent 
of the cases the highest interest fell in the same 
area as the highest ability. For 20 per cent of 
the cases, the second highest interest was in 
the same area as the highest ability. Continuing
this procedure, the percentages became
increasingly smaller so that for only 4 per cent
of the cases was the seventh, or lowest, interest
in the same area as the highest ability. Thus,
for 51 per cent of the group, the first or second
highest interest was in the same area as the
highest ability. Chance would have permitted
only 29 per cent, and predictability was thus 
22 per cent better than chance. 
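
The chance figure quoted above follows from simple arithmetic, sketched here under the assumption that "chance" means interest ranks assigned at random among the seven areas, so that the area of highest ability has a 2-in-7 probability of holding the first or second interest rank. The names used are illustrative only.

```python
from fractions import Fraction

N_AREAS = 7  # seven interest areas paired with seven ability tests


def chance_within_top(k, n_areas=N_AREAS):
    """Chance probability that the area of highest ability is among the
    k highest-ranked interests, if interest ranks were assigned at random."""
    return Fraction(k, n_areas)


# Observed percentages quoted in the text for interest ranks one and two.
observed_first, observed_second = 31, 20
observed_top_two = observed_first + observed_second    # 51 per cent
chance_top_two = round(100 * chance_within_top(2))      # 2/7, about 29 per cent
gain_over_chance = observed_top_two - chance_top_two    # 22 percentage points
```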

Similar procedures were applied in studying 
the number of cases in which the second highest 
interest fell in the same area as the second 
highest ability, and so on throughout the seven 
different ranks. It was shown that prediction 
from ability to interest for rank one and for 
rank seven was better than prediction for 
ranks two through six. This is explained in 
part, of course, by the fact that, at the ends, 
the error can extend in only one direction, while 
in the middle it can vary in two directions. 

It is probably this end-effect phenomenon 
which has led vocational counselors to the 
clinical belief that there is a higher relationship 
between interests and abilities than has been 
shown by statistical studies, where the inter- 
vening ranks must also be considered and where 
overall accuracy of prediction is lowered ac- 
cordingly. In observing generally good agree- 
ment between highest interest and highest 
ability, and between lowest interest and lowest 
ability, counselors have to this extent at least 
considered the ordinal position of both, and 
have thus actually based guidance on the rela- 
tion of scores to the individual's own mean 
rather than to that of the group. Although 
increased accuracy of prediction from extreme 
ranks may be largely a statistical artifact, the 
use of such prediction for purposes of vocational 
guidance would appear to be justified. 


Summary 


The purpose of this study was to investigate 
the relationship between vocational interests 
and abilities when the magnitude of test scores 
is relative to the individual's own level, rather 
than to the group level of interest and ability. 

The procedure and findings were as follows: 


1. The Kuder Preference Record and ability
tests corresponding to seven of the Kuder
interest areas were administered to 156 male
college students.


2. A Pearson r correlation was first obtained 
between each interest and its corresponding 
ability for scores based on deviations from 
group means. The mean of these seven cor- 
relations was .30. 

3. Pearson r correlations between interest 
and ability were then obtained for scores 
based on deviations from individual means. 
The mean of these seven correlations was .42,
which was shown to be significantly higher 
than the mean of .30 derived from the scores 
based on deviations from group means. 

4. Rank-order correlations between the seven 
interest areas and the seven ability areas were 
computed for one hundred individuals. The 
mean of these correlations was .46, which was
shown to be significantly higher than the mean 
Pearson r correlation of .30 derived from scores 
based on deviations from group means. 

5. There was shown to be a wide range of
individual differences in interest-ability
congruency. Rank-order correlations ranged from
-.57 to +1.00. For the 25 per cent of the
group having highest and the 25 per cent
having the lowest interest-ability correlations,
no significant difference was shown in
mean age and mean intelligence. However,
the group having the highest agreement did
show a tendency to less adequate personality
adjustment in that mean scores on eight of the
nine scales of the Minnesota Multiphasic
Personality Inventory were higher than mean scores
for the group having lowest interest-ability
agreement. Only one of these differences,
that for Schizophrenia, was significant.

It was suggested that the use of projective
techniques would be a useful approach in
further study of the personality factors related
to this individual variation in the congruency
between interest and ability.

6. For individual interest and ability scores
arranged in rank order, it was shown that
prediction from the extreme ability ranks, one
and seven, to the corresponding interest
ranks was much better than prediction from
the intervening ranks, two through six.
Although this is largely an artifact due to the
end-effect, and not a true difference in accuracy
of prediction, it has probably led vocational
counselors to consider the extreme rank
positions of interests and abilities for a given
individual, and to apply these findings in their
vocational guidance work. Such a procedure
would seem to be justified and of value when
applied in this manner.


Received August 31, 1949. 


References 


1. Corey, D. Q. A comparison of two methods of 
determining the relationship between vocational 
interests and abilities. Unpublished Master's 
thesis, University of Southern California, 1947. 


2. Segel, D. Differential prediction of scholastic suc- 
cess. Sch. and Soc., 1934, 39, 91-96. 
3. Stewart, Barbara M. A study of individual varia- 
bility in the relationship between interest and 
ability. Unpublished Master’s thesis, Univer- 
sity of Southern California, 1947. 
4. Strong, E. K. Vocational interests of men and women.
Stanford University, California: Stanford University
Press, 1943.
5. Triggs, Frances O. A study of the relation of Kuder
Preference Record scores to various other measures.
Educ. psychol. Measmt., 1943, 3, 341-354.


A Projective Test for Vocational Research and Guidance 
at the College Level * 


Robert B. Ammons 
University of Louisville 


Margaret Newman Butler 
Colorado Woman's College 


and 


Sam A. Herzig 


University of Denver


In their professional work the clinical psy- 
chologist and vocational counselor are often 
called upon to relate for a given individual 
problems associated with occupation to under- 
lying personality structure. This need to re- 
late personality and vocations has been dis- 
cussed at some length by Bixler and Bixler (7), 
Darley (10), Kilby (14), and Trabue (23). As 
a practical matter, it is often necessary to 
recognize stated vocational problems as symp- 
tomatic of deeper personality imbalances, and 
to attempt to treat these more fundamental 
disturbances prior to attempting to find solu- 
tions for the vocational problems. 

Berkshire, Bugental, and Cassens (6) in a 
survey of tests used in guidance centers report 
that by far the most frequently utilized per- 
sonality instruments are of the paper and pencil 
type, the Bell Adjustment Inventory, the Min- 
nesota Multiphasic, and the Bernreuter Person- 
ality Inventory. The known clinical inade- 
quacy of these tests is such that no comment is 
necessary here. The Rorschach is reported 
as being used in only 20 per cent of the guid- 
ance centers in the sample, and the Thematic 
Apperception Test is not even listed among the 
79 tests estimated to be most frequently used. 
Although formal projective devices are appar- 
ently not widely used in guidance centers, they 
are accepted as being valid for general clinical
use, and are sometimes used in conjunction
with counseling (2).


* Plates and manual (1) can be obtained from R. B.
Ammons. Thanks are due Professor R. A. ... of the
University of Nevada, Professor R. B. Winn of
Monmouth College, and Mrs. C. ... of the
University of Louisville for their critical reading of this
article and many helpful suggestions.


In his comprehensive review of projective
techniques, Bell (4) mentions only a few systematic
attempts to link projective methods of investigating
personality and the vocational field.
Shagass (20) reports the use of word association
tests in Canadian pilot selection. The Rorschach
has been used for job screening (3, ...),
investigation of personality structures associated
with particular occupations (12, 18, ...), and
selection of mechanical workers (17). Tomkins
(22) devotes a chapter of his manual for
the TAT to personality diagnosis with respect
to work and vocation setting. Projective tests
have not been used extensively in vocational
guidance for a number of reasons, among these
being the training and experience necessary for
interpretation, the "excessive" time requirement,
and the usual difficulty in relating
personality structure to vocational problems. This
latter limiting factor is well illustrated in a
study by Kurtz (15) who found that even a
specially constructed scoring system did not
permit a satisfactory prediction on the basis of
Rorschach scores of the success or failure of
sales managers.
A more direct approach to vocational guidance
problems has been made with interest
tests. Berkshire, Bugental, and Cassens (6)
report the Kuder Preference Record, the
Strong vocational interest blanks, and the
California Test Bureau's Occupational Interest
Inventory to be the most frequently
used. All three tests call for rigidly categorized
answers, and scores are interpreted
either on a strictly empirical basis of interest
patterns for people already engaged in an
occupation (Strong, Kuder) or a rationalistic
à priori basis (California). Although tests of
this type work fairly well in practice, they 
suffer from many weaknesses: (a) the client 
does not indicate reasons for his preferences; 
(b) interpretation tends to underemphasize the 
relationships of the obtained interest patterns 
to the total personality (8, 14); (c) subtleties of 
feeling cannot be expressed in terms of the 
categorized prepared answers; (d) little pertinent
qualitative information can be gained by
observation during testing; and (e) several
studies (9, 13, 16, 21) have shown that
answers to such tests can be “slanted” almost
at will.

Briefly, it can be generalized that present
projective tests do not satisfactorily relate
personality to vocational problems, and present
interest inventories do not relate vocational
problems to personality. Berdie (5) in his
review of factors related to vocational interests
mentions several studies which report statistically
significant relationships between paper
and pencil personality tests and vocational
interest tests. Such information is interesting
and provocative. What seems to be needed
is a test which combines the specificity of
content now found in the interest inventories
with the flexibility and depth inherent in projective
tests, and gives information as to the
client’s interests, their origins, their function
in the general personality structure, and how
they are likely to affect behavior in the future.


Problem 


The purpose of this study was to construct
a projective test that would measure vocational
attitudes and interests, and at the same time
give information concerning related psychological
forces operating within the individual's
personality. To accomplish this, the following
steps were undertaken: (a) construction of
drawings of vocational situations which would
meet requirements of neutrality, vagueness,
and disguised purpose, which nevertheless
would evoke a significant variety of responses
related to vocations; (b) devising of a scoring
system to identify and objectify significant
aspects of responses; (c) ascertaining of the
reliability of the scoring system; and (d)
estimating the validity of the obtained test
scores by determining their capacity to distinguish
between groups of known composition,
and their consistency with personal data and
results from other tests.


Procedure 


Materials: The Vocational Apperception Test¹
consists of 18 line drawings, 835} inches (1). 
An individual engaged in a specific occupation 
is shown in each. Ten of these were designed 
for administration to women. To facilitate 
identification, the main figures on these 10 
plates are women. The remaining 8 plates 
were structured for men, with corresponding 
central male figures. 

The ten occupations depicted for women are: 
(a) laboratory technician, (b) dietician, (c) 
buyer, (d) nurse, (e) teacher, (f) artist, (g) 
secretary, (h) social worker, (i) mother, and (j) 
housewife. The plates for the eight men's 
occupations show: (a) teacher, (b) executive or 
office worker, (c) doctor, (d) lawyer, (e) engi- 
neer, (f) personnel or social worker, (g) sales- 
man, and (h) laboratory technician. The 
occupations shown were arbitrarily chosen 
because of ease of representation and represen- 
tativeness of occupation. 

In drawing the plates, certain rules were care- 
fully followed. Scenes dealt as specifically as 
possible with one particular occupation or 
type of occupation. Facial expressions and 
posturings were ambiguous in feelings ex- 
pressed. Line drawings were used, since they 
were felt to be less realistic than photographs, 
less structured, yet could emphasize the 
desired aspects of an occupational situation 
quite adequately. 

Before the final sets of plates were drawn, 
five preliminary plates for five occupations 
were used to test 15 college men and women 
informally. Results from this preliminary 
testing and accompanying interviewing were 
used to work out the final testing procedure. 
The test results and discussion indicated that 
occupational representations should be made 
less ambiguous and content simpler. This 
lead was followed in drawing the final set of 
18 plates. 

Subjects: 40 female and 35 male subjects (Ss) 
were selected for the two experimental groups. 


¹ Henceforth to be referred to as the VAT.


Women Ss were sophomores at Colorado 
Woman's College to whom 1946 edition Strong
Vocational Interest Tests for Women (Re- 
vised) previously had been administered as a 
part of a general testing program for all 
students. Answers were machine scored for 
all scales and records showing only one “A” 
scale rating were segregated from those of the 
entire sophomore class. Inasmuch as a large 
proportion of Ss had “A” housewife ratings, in 
addition to other primary patterns, it was 
necessary in some cases to disregard the house- 
wife scores. However, Ss selected for the 
housewife criterion group had no “A” ratings 
except for housewife. Five women were 
randomly selected from each of 8 occupational 
categories thus set up: laboratory technician, 
dietician, buyer, nurse, teacher, artist, secre- 
tary, and housewife. 

The 35 men who participated were junior, 
senior, and graduate students at the University 
of Denver. Five advanced students majoring 
in and feeling a primary interest in each of the 
7 following fields were chosen:² teaching, accounting,
premedical training, engineering,
social work, salesmanship, and art. All were
given the 1938 Strong Vocational Interest
Blank for Men (Revised). Answers were
machine-scored for all men’s scales.

In the selection of the samples there was no 
direct control of age or previous occupational 
experience. 

Testing: The VAT was individually admin- 
istered, responses were recorded verbatim, and 
manner of responding noted. Es and Ss were 
matched as to sex, male testing male, and 
female, female. The following instructions 
were given: 

“The purpose of this test is to find out how 
people go about understanding human be- 
havior. You probably realize that insight 
into others' behavior helps us get along with 
them. Ordinarily, when we meet people, we 
try to ‘size them up.’ We do it all the time. 
I am going to show you some cards and your 
job will be to tell me a story about the people 
pictured on these cards. On each card, will
you tell me how the person came to be in this
situation, how he (she) feels about it, and what
the future holds in store for him (her). Often,
it will be necessary to use your imagination.”

² These fields were chosen because students
were available. They are those represented on the
plates, except that law students and laboratory
technicians were omitted, and art students added to the
sample.

Following completion of the VAT, male Ss
were asked what they thought was being
measured. At all times Es’ comments were
guarded so as not to indicate the purpose of
the test, to minimize conscious "slanting" of
stories by the Ss.

Scoring: It has already been mentioned that
scores were available for all Ss for all scales of
the Strong vocational interest blanks. The
real need was for a method of scoring VAT
responses. It was decided to score for general
attitude toward a specified occupation (using
S’s report and picture content to decide subjectively
and arbitrarily which occupation),
reasons for entering an occupation, conflict
areas or areas of concern, and vocational and
personal outcomes. The following outlines
cover the scoring system as used by Es and
other judges. Scoring consisted of categorizing
each response under one or more of the
sub-headings in each main scoring area.

A. General attitude toward an occupation
(indications from verbalization or judged feeling
tone): 1. Like! 2. Like. 3. Indifferent.
4. Dislike. 5. Ambivalence.

B. Reasons for entering occupations: 1. Interest
and enjoyment. 2. Ability. 3. Status.
4. Income. 5. Opportunity. 6. Security. 7.
Altruism. 8. Contact with people. 9. Salir sion.
10. Forced into. 11. Excitement and
curiosity. 12. Experiment. 13. Independence.
14. Transfer of training. 15. Temporary employment.
16. Training. 17. Contact with
field. 18. Desire to influence others. 19.
Idealism. 20. Desirable conditions of employment.
21. None stated.

C. Areas of conflict or concern: 1. Personal
conflict (generalized): a. achievement; b. affiliation;
c. aggression; d. inadequacy and insecurity;
e. independence; f. recognition; g. evaluative.
2. Home and parental conflict. 3. Marital
conflict. 4. Financial conflict. 5. Educational conflict.
6. Vocational conflict. 7. Health conflict.
8. No conflict mentioned.

D. Vocational and personal outcomes: 1. Success ...
2. ... training. 3. Leaves field. 4. Not clearly stated.
5. Confusion. 6. Disaster. 7. Continuous dissatisfaction.

For Women Only: 8. Marry, but continue in
field. 9. Marry individual who has an allied
interest in field. 10. Marry and leave field.

A more detailed account of scoring procedures
and criteria can be found in the
manual for the VAT (1).


Results 


Reliability of scores: When Es first attempted
scoring with only the outline of the present
scoring system, the percentage of rescoring
self-agreement ranged from 66 to 75, calculated
in terms of the number of times the stories were
categorized in the same way in the various
scoring areas. This was not felt to be satisfactory,
so more precise criteria for differentiation
were worked out. After a great deal of
discussion of scoring problems coupled with
repeated scorings of the same records, standards
were set up (1).
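
The agreement percentages reported in this section can be read as simple proportions of identical categorizations. A minimal sketch follows, assuming one category label per story per scoring area and an exact-match rule; the function name and the sample labels are hypothetical, and the manual's actual counting rules may differ.

```python
def percent_agreement(first_scoring, second_scoring):
    """Percentage of stories placed in the same category on two scorings."""
    if not first_scoring or len(first_scoring) != len(second_scoring):
        raise ValueError("both scorings must cover the same, non-empty set of stories")
    matches = sum(a == b for a, b in zip(first_scoring, second_scoring))
    return 100.0 * matches / len(first_scoring)


# Hypothetical general-attitude categories for four stories, scored twice.
week_one = ["Like", "Dislike", "Like", "Indifferent"]
week_two = ["Like", "Dislike", "Ambivalence", "Indifferent"]
self_agreement = percent_agreement(week_one, week_two)  # 3 of 4 stories match
```

The same function serves for agreement between two different judges by passing one judge's categories as each argument.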

Ten protocols were then chosen at random
from the as yet undiscussed records and each
of two Es scored five independently on two
occasions a week apart. Rescoring self-agreement
was now 86 per cent for all protocols and
all scoring categories taken together. Agreement
varied little for the various scoring aspects
(general attitude, areas of conflict, outcomes,
reasons for entering) and there was
practically no difference in self-agreement
with male and female records. Inspection of
the records showed no significant difference between
these experienced scorers in reliabilities.

Since it was clearly possible for experienced
judges (2 Es) to attain a high level of personal
consistency in scoring, a check on inter-person
consistency was made. Four graduate students
with experience in TAT scoring³ were
given two randomly selected protocols each
and a set of scoring instructions. General ...
Although consistency between experienced and
inexperienced scorers was not as high (mean of 69
per cent) as the personal consistency of the
more experienced Es, it was felt to be satisfactorily
high. There was a wide difference
in the scoring agreement of these various outside
judges, one showing quite poor agreement
(54 per cent) and another agreeing with the experienced
E essentially as closely (85 per cent)
as he did with himself. Scoring agreement was
lower for areas of conflict (57 per cent agreement)
than for either general attitude (74 per
cent) or outcomes (76 per cent). However,
scoring aspect had less effect on agreement
than the personal scoring proficiencies of different
judges.⁴

³ Thanks are due Miss Janet Ambler, Mrs. Helen
Ammons, Mr. Seymour ..., and Mrs. Ann Neel
for their time willingly spent serving as judges.

With several hours training a high level of 
VAT scoring agreement could certainly be 
reached. Tomkins (22) points out that “. . . 
in TAT workshops it is common for the relia- 
bility of ratings to be very low at the beginning 
but to increase to respectable magnitude with 
practice.” 


Comparison of Strong Ratings 
with VAT Preferences 


Each story was scored only once, and 
only for the occupation the plate was designed 
to depict. The story was not scored in the 
few instances (less than 1 per cent) where the 
occupation was completely misinterpreted. 
Combining categories, a chi-square test of the 
independence of Strong scores and VAT 
preference ratings was made for all Ss. The 
hypothesis of no relationship could be rejected 
at the 10 per cent level of confidence for the 
women and at the 2 per cent level of confidence 
for the men. Although a relationship was thus
demonstrated, it is not as clear-cut as one
might like. This may be due to the fact that
on the whole, although a person may not have
many interests in common with people already
engaged in and successful in an occupational
area, he may be well disposed toward the area
as he understands it. Thus it might be expected
that the relationship between Strong
scores and VAT ratings would not be high.

⁴ To save space and cost, Tables 1, 2, 3, 4, and 5 have
been deposited with the American Documentation
Institute. Order Document No. 2748 from the American
Documentation Institute, 1719 N Street, N.W.,
Washington 6, D. C., remitting $.50 for microfilm
(images 1 inch high on standard 35 mm. motion picture
film) or $.50 for photocopies (6 X 8 inches) readable
without optical aid.
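
For readers unfamiliar with the chi-square test of independence referred to in this section, the computation can be sketched as follows. The contingency table shown is purely hypothetical (the actual frequencies are in the deposited tables, which are not reproduced here), and the way categories were combined in the study is not specified in the text.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for an r x c frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency if the two classifications were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical 2 x 2 table: rows = high / low Strong rating,
# columns = favorable / unfavorable VAT attitude (not data from the study).
hypothetical_table = [[10, 20], [20, 10]]
chi2, df = chi_square_independence(hypothetical_table)
```

The resulting statistic is then compared with the tabled chi-square value for the stated degrees of freedom at the chosen level of confidence.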


Characteristics of Responses to the VAT 


Reasons for entering the occupation were
found difficult to score, so scoring was not done
in this area. The first analysis made was of
total number of conflicts by conflict area and
occupation⁵ for the women and men Ss respectively.
The men showed more conflicts
than women (a mean of 2.0 per story as compared
with 1.7). The men were more concerned
with achievement, insecurity, vocational,
and personal value conflicts; while the
women showed more marital and affiliation
conflicts. The men showed little overall difference
in type of conflict from vocation to vocation,
except perhaps for teaching; while women
gave evidence of frequent conflicts associated 
with teaching, but very few in the housewife 
and mother areas. Although the data are not 
conclusive, there seems to be some evidence 
for an interaction between nature of conflict 
and specific occupation. The women showed 
more conflicts of aggression and insecurity in 
the teacher area, and more conflicts of achieve- 
ment, recognition, and vocational choice in the 
artist area. 

An analysis was made of the outcomes associ- 
ated with the various plates by the total male 
and female groups. The most frequent out- 
comes in women's stories were success, con- 
tinuing in the field without success being men- 
tioned, and marrying and leaving the field. 
Laboratory technicians, buyers, and nurses 
were pictured as most successful, while dieti- 
cians and teachers were described as leaving or 
desiring to leave the occupation. Nurses, 
artists, and social workers married, but stayed 
in the field, and laboratory technicians married 
some one with an allied interest. Finally,
Secretaries, dieticians, and nurses were often 
pictured as marrying and leaving the field. 

The most frequent outcomes for men were
...; it would seem that ... some kind of success.
Lawyers, laboratory technicians,
and doctors were the most successful; the
salesman moved to a better position within the
same employment area; and the teacher left
that field of employment.

Ss of both sexes infrequently described
teachers as successful, and even where they
were successful the success tended to be only
mediocre. Both men and women put great
emphasis on success, and in addition women
frequently told stories about retiring from
economic competition to become housewives
and mothers.

⁵ Tables 1, 2, and 3 are included among those available
from the ADI (see footnote 4).

⁶ See Table 4 included in the ADI set (footnote 4).

The final analysis was of the average numbers
of words and the associated standard deviations
in stories told by the male and female
Ss to cards picturing similar occupations.
Mean lengths were essentially the same (ranging
from 104 to 140 words) for the six other
cards,⁷ differing only in the case of teaching
where men told significantly shorter stories
(mean of 90 words). The story lengths were
about the same for all occupational situations
except for teacher for men and lab technician
for both men and women, which were shorter.

Insight Regarding Purpose of VAT 


After each male subject had been tested
with the VAT, he was asked what he thought
the purpose of the test might be. It was
concluded that only 7 of the 35 men came near
to understanding the purpose of the test.
From the responses given, it could be concluded that
the likelihood of undesirable systematic slanting
to create an impression was small.


Qualitative Evaluation 


Much useful information can be gained
clinically from qualitative observation and
use of “total impressions.” Certain observations
worthy of mention were made during the
VAT testing program. From content analysis
of the protocols, it was possible to obtain a
fairly definite idea of such things as the extent
and accuracy of S's information about an
occupation, the use to which he would put
physical equipment and his feelings toward it,
some of his acute personal problems directly
associated with occupations, the place of vocations
in his personality dynamics, and how he
would like to handle his personal problems
concerning vocations. It was usually easy to
estimate the level of identification with the
principal figure in the story, and the degree of
personal involvement in the story. Significant
personal data of many kinds were obtained,
particularly concerning traumatic experiences.

⁷ Table 5 is available in the ADI set (see footnote 4).
Comparisons are made of lab technician with lab technician,
salesman with buyer, doctor with nurse, teacher
with teacher, engineer with artist, office worker with
secretary, personnel worker with social worker.

As a rule, when presented with plates con- 
cerning their own occupational interests, Ss 
would show a marked increase in enthusiasm. 
Ten Ss well known to Es gave clearly recognizable
stories. Blind matching would probably
have been perfect or nearly so with actual
vocational sketches supplemented by per- 
sonality information. 

The following story illustrates the power of
the technique: Male, college senior, age 24,
Story to Plate 4 for men: He’s an attorney
before a court. On one side you can see the
judge; on the other side, the jury. He's defending
a guilty person. Like any lawyer, he's
looking for any angles to prove the situation.
He's not sure of himself, if the circle on the
top of his head is a question mark. Am I right?
What do you see it as? Why don't you ever
tell me if I'm right or not? He's been pushed
into this job and can't get away from it. It
looks like a pretty tough job. He's not succeeding.
He wonders if he's going to change
his job or not. He has in mind this problem,
and is questioning himself as to what kind of
job he will have. He wonders if he is going to
succeed or not. It seems to me that he’s not
sure of himself. He never has been. There is
something in back of his mind that he doesn’t
want to tell people about this. He would
rather hide it than tell other people. A kind
of non-professional job. He would rather not
have too much responsibility.


Discussion 


The basic assumption of projective testing
is that S will interpret stimuli in a way which
will reflect the cognitive and emotional organi-
zation of his personality. There is a con-
siderable amount of evidence that the VAT
provides a suitable situation for the projection
of feelings and ideas related to S’s vocational
problems. Scored conflict areas were different
for stories about different occupations, and
differed for different individuals. Outcomes
varied with the occupational situation about
which a story was told and the sex of S. In 
view of the high scoring reliability these find- 
ings seem to indicate a basic validity for the 
procedure, and are supported in this by a 
qualitatively observed close correspondence be- 
tween information derived from personal ob- 
servation and that from test responses con- 
cerning personality facets and vocational prob- 
lems. The wide variability in the character- 
istics of the responses given speaks against 
the hypothesis that the content and structure 
of the pictures primarily determined the stories 
told. 

The only findings which might be in- 
terpreted as indicating a low validity were 
disagreements in interpretation and a low 
relationship between Strong interest ratings 
and VAT attitude-toward-occupation ratings. 
There were a considerable number of disagree- 
ments in interpretation of responses within the 
scoring system as set up, as evidenced by the 
failure to obtain more nearly perfect scoring 
consistency. Analysis of these disagreements 
almost always led to the conclusion that both 
interpretations were reasonable, and that they 
merely represented essentially equally valid 
but different levels of abstraction in interpre- 
tation. Thus the disagreement may be a re- 
flection on rough methods of scoring and inade- 
quate personality theory rather than the 
validity of the test. 

The discovery of anything but a low correla- 
tion between Strong ratings and VAT scores 
would be little short of amazing. Among 
other things, a high correlation would indicate 
a close relationship between interests common 
to persons working in an occupational area and 
the attitudes of a group of relatively inexperi- 
enced and uninformed persons toward that 
type of occupation. This relationship is not 
likely to obtain. With our group, and perhaps 
any group, one would expect that there would 
be a wide variety of occupations eliciting favor- 
able responses in a projective test. On the 
other hand, the Strong items are deliberately 
weighted to produce score differences between 
occupational interest areas. What is really 
needed is a thorough empirical study of per- 
sonality-vocational-interest relationships. 

The VAT is believed by the authors to be a 
much more versatile clinical instrument than 


204 Robert B. Ammons, Margaret Newman Butler, and Sam A. Herzig 


standardized paper and pencil tests of voca-
tional interest or personality with provision
only for categorical answers. It can be used
effectively to obtain a large variety of signifi-
cant information about S, including biographi-
cal facts of importance, basic conflicts, need-
press, nature of identifications with others of
his own sex, methods of problem solution, and
attitudes toward occupations and possible
reasons for them. Reading and writing are
not called for, so it may be useful in growth
studies and with relatively illiterate people.


Summary 


A set of 18 plates for the projective testing of
personality structure related to vocational 
problems on the college level was constructed, 
with 10 plates for women, and 8 for men. 
Each plate was ambiguously drawn but clearly 
represented a particular occupational area. 
Methods were developed for identifying and 
scoring general attitude toward an occupation, 
reasons for entering an occupation, general 
and occupational conflicts, and personal or 
vocational outcomes. 

The Vocational Apperception Test (VAT) 
was administered to 40 female and 35 male 
college students with primary interests in cer- 
tain occupational areas as demonstrated by 
Strong vocational interest ratings or declared 
major or both. The stories were scored and 
the scorings analyzed. It was found that 
consistency of scoring was approximately 86 
per cent for experienced scorers rescoring 
protocols after a week, and 69 per cent between 
experienced and inexperienced scorers. Indi-
cated areas of conflict and outcomes varied
with the sex of the subject, the occupational 
areas about which the story was told, and the 
particular subject tested. A low but statis-
tically significant relationship was found be-
tween ratings on the Strong scales and the VAT
rated general attitude toward an occupation.
Information from a follow-up interview in-
dicated that only 7 of the 35 male subjects
guessed the purpose of the test.

In the judgment of the authors, the above
findings indicate a satisfactory reliability and
validity for the test. Its flexible form, and its
emphasis on depth information recommend it
for use in the clinical exploration of personal
vocational difficulties and in attacks on a
variety of significant research problems.


Received August 26, 1949. 




Preferred Rate and Extent of the Frequency Vibrato * 


John F. Corso and Don Lewis 
State University of Iowa 


Information has been available for several 
years on the rates and extents of the frequency 
vibrato¹ in artistic vocal and instrumental per-
formance. Extensive studies were made by
Seashore and his associates on the physical 
characteristics of vibrato tones. Further, 
various individuals have expressed opinions 
concerning the desired rate and extent of the 
frequency vibrato. 

Seashore (6), dealing with the expression of 
emotion in violin music, reports that "the most 
beautiful effect is obtained when the pitch 
oscillation does not exceed one-fourth of a 
tone." In 1929, Stanley (10) stated that “A 
proper vibrato must be absolutely regular, 
must have the correct frequency—about six 
a second. . . ." Seashore and Metfessel (7) 
maintain that in vocal performance “An ampli- 
tude of approximately a half tone interval in 
pitch is a good vibrato.” In a study on the 
control and refinement of the vocal vibrato, 
Wagner (14) selected the rates and extents of 
recognized artists as the criteria for pleasing 
vibratos. These opinions, however, were not 
supported by experimental evidence, except the 
evidence that certain artistic vibratos had 
rates and extents of given magnitudes as 
determined by the physical analysis of the 
stimulus tones. 

Up to the present time, there has been a 
general lack of information on audience prefer- 
ences for vibrato tones. Such information 
should have considerable practical value, both 
to musical performers and to designers of 
electronic musical instruments. 


Purpose of the Investigation 


The present study was undertaken to dis- 
cover listener preferences for vibrato tones. 


* This investigation, which was concerned with the
generation and psychological analysis of synthetic
music, was part of a research program financed by a
grant from the Research Corporation.

¹ The vibrato may be defined as a musical embellish-
ment consisting of a periodic rise and fall in the frequency
of a tone (11). This relatively periodic frequency
variation is usually accompanied by synchronous
fluctuations in intensity and periodic alterations in wave
form.


Specifically, the purposes were, first, to deter- 
mine for musically untrained individuals the 
preferred combination of rate² and extent³ in
the frequency vibrato of a complex tone at 
each of five octave levels in the equal-tempered 
musical scale; and, second, to discover the 
manner in which vibrato preferences varied 
with the subjects’ musical training or native 
musical ability. 


Procedure 


Apparatus. As it was highly desirable to
employ a musically acceptable auditory stimu-
lus, a cascade of multivibrators (12) was em-
ployed as the tone generating unit. Two
specific features of this electronic unit made it
particularly adaptable to the experimental
situation: (a) the multivibrators produce har-
monically rich wave forms, commonly reach-
ing the three-hundredth harmonic in the out-
put (13), and (b) the multivibrators possess the
property of easily synchronizing with a
voltage having n times its frequency, where n
is any whole number. Through the
action of the set of multivibrators, it was
possible to produce a descending series of
tones having a frequency ratio of 2:1. The
frequencies of these tones, separated by octave
intervals, were 92.5, 185, 370, 740, and 1480
cycles per second. These values correspond
to the octave level designations of F#2, F#3,
F#4, F#5, and F#6, respectively, at which the
preference measurements were made. Inas-
much as the complex tones from the multi-
vibrators were excessively rich in harmonics, it
was found necessary to eliminate a consider-
able number of upper harmonics by means of
filters.

The primary component of the experimental
apparatus was a frequency vibrato control
unit. In this unit, the vibrato rate was
controlled by the low frequencies produced by
a pentode phase shift oscillator which imparted
sine wave oscillation to the multivibrator tone.
The rates of oscillation employed were 5.5, 6.0,
6.5, and 7.0 pulsations per second.⁴ The extent
of the frequency modulation of the multivi-
brator tones was controlled by a reactance
tube circuit and voltage dividing network con-
nected in parallel with the tuned circuit of the
master oscillator. The extents of modulation
employed were 0, 0.10, 0.25, and 0.40 of a
musical step.

² The rate of a vibrato is the number of pulsations per second, i.e.,
the number of frequency modulations per second.

³ The extent of a vibrato, as used in this paper, is the
width or total range of the frequency modulation and is
limited in meaning to the physical measure of frequency change
in the acoustic wave. Vibrato extent is
expressed as a fraction of a whole step in the equal-
tempered musical scale.
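The stimulus values described above can be worked through numerically. The following is a minimal sketch, not a description of the apparatus: it assumes the five tones are exact doublings of 92.5 cycles per second, and it assumes (the text does not say) that the vibrato swings symmetrically about the nominal pitch on a logarithmic frequency scale. The function names are hypothetical.

```python
# Sketch only: octave ladder and frequency limits of a vibrato of given extent.
BASE_HZ = 92.5  # lowest tone reported in the text
OCTAVE_NAMES = ["F#2", "F#3", "F#4", "F#5", "F#6"]


def octave_ladder(base_hz=BASE_HZ, names=OCTAVE_NAMES):
    """Each successive tone doubles the frequency (2:1 octave ratio)."""
    return {name: base_hz * 2 ** i for i, name in enumerate(names)}


def vibrato_limits(center_hz, extent_steps):
    """Lowest and highest frequency of a vibrato whose total width is
    `extent_steps` whole steps (one whole step = 2/12 of an octave).

    Assumption: the swing is centered on the nominal pitch in log frequency.
    """
    half_width_octaves = (extent_steps * 2 / 12) / 2
    low = center_hz * 2 ** (-half_width_octaves)
    high = center_hz * 2 ** half_width_octaves
    return low, high


ladder = octave_ladder()
low, high = vibrato_limits(ladder["F#4"], 0.25)
print(ladder)
print(f"0.25-step vibrato at F#4: {low:.1f} to {high:.1f} cycles per second")
```

Under these assumptions an extent of 0.25 step at 370 cycles per second spans roughly 365 to 375 cycles per second, which gives a sense of how narrow the modulations being compared were.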

A system of lever action switches, operated 
by the experimenter, made it possible to com- 
bine each of the four vibrato rates with each 
of the four extents. In this manner, by com-
bining each rate with each extent,⁵ twelve
vibrato tones and one “straight” tone were
produced. These tones were then presented
by the method of paired comparisons, each
tone being paired with every other. Regular
and inverse orders of presentation were
employed.
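The size of the stimulus set and of the paired-comparison schedule follows directly from this description. A short sketch of the counting, assuming that every pair was presented once in each order (the text says only that regular and inverse orders were employed); the variable names are illustrative.

```python
# Sketch only: enumerate the tones and the paired-comparison schedule.
from itertools import combinations

RATES = [5.5, 6.0, 6.5, 7.0]       # pulsations per second
NONZERO_EXTENTS = [0.10, 0.25, 0.40]  # fractions of a whole step

# An extent of 0 gives the same "straight" tone whatever the rate,
# so it contributes one tone rather than four.
tones = [("straight", 0.0)] + [(r, e) for r in RATES for e in NONZERO_EXTENTS]

# Each tone paired with every other tone.
pairs = list(combinations(tones, 2))

# Assumption: each pair is also presented in the inverse order.
both_orders = pairs + [(second, first) for first, second in pairs]

print(len(tones), len(pairs), len(both_orders))
```

On this reading a session contained 156 pairs, which agrees with the statement that the rest period after 80 pairs fell approximately midway through the session.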

An electromechanical timing device was 
arranged to present each of the two stimulus 
tones of every pair over a high fidelity loud-
speaker for a period of 1.7 seconds. The two
tones of each pair were separated by an interval
of 1.0 second. Five seconds elapsed between
pairs. At the end of each series of ten judg-
ments, there was a pause of 23.8 seconds. Ap-
proximately midway through the experimental
session, after 80 pairs had been presented, a
five minute rest period was provided during
which the subjects were permitted to leave the
listening studio.

Subjects. The subjects employed were di-
vided into two main categories: (1) individuals
with little or no formal musical training as
determined by answers on a personal question-
naire, and (2) individuals with sufficient
musical training and ability to be members of
the University of Iowa symphony orchestra
or chorus. In all, 385 subjects were used. Of
these, 331 were non-trained individuals and 54
were trained musicians.

⁴ The rates and extents of vibrato that were used
were selected on the basis of results obtained in a pre-
liminary study.

⁵ Combining four rates with four extents in this case did
not yield sixteen different tones, since an extent of 0
combined with any rate resulted in a “straight” tone
(that is, a tone without vibrato).

The non-trained group was further sub-
divided as follows: (1) Group A, 93 subjects,
served at octave level F#2 for the first hour,
F#5 the second hour, and took both forms of the
Seashore time test at the third hour; (2) Group
B, 144 subjects, served at octave level F#3 for
the first hour, F#6 the second hour, and took
both forms of the Seashore pitch test at the
third hour; (3) Group C, 94 subjects, served at
octave level F#4 at each of the first two hours
and was administered both forms of the Sea-
shore timbre test at the third hour. In all
groups, one week elapsed between each of the
three experimental sessions.

The trained musicians served only for a 
single hour at octave level F#4 (370 cycles per
second). None of the tests from the Seashore 
battery was administered to this group. 

Prior to the beginning of each laboratory 
period in which vibrato preference judgments 
were to be made, the following instructions 
were read: 


"This is an experiment dealing with the 
properties of the musical vibrato. You will be 
presented with a series of pairs of tones, each 
member of every pair possessing a different 
vibrato or no vibrato. You are asked to select 
the member of each pair which you like best. 
If you like the first tone of a pair better than 
the second, write the number 1 on your record 
sheet; if you like the second tone of a pair 
better than the first, write the number 2 on 
your record sheet. Be sure to make a choice 
on every pair; that is, write down for each pair 
either the number 1 or 2. Are there any 
questions?" 


The musically untrained group consisted of 
experimentally naive persons, all students in an 
elementary course in psychology. The trained 
musicians were advanced undergraduate and 
graduate students majoring in the area of 
music. Approximately 25 subjects were used 
at each experimental session. In an attempt 
to minimize any possible effects resulting from 
the order of presentation of the pairs of stimulus 
tones, four different random orders were used 
at each of the five octave levels tested. 


Results 


Although the method of paired comparisons 
was used to obtain the desired data, the mathe-
matical analysis did not involve the computa-
tion of scale values in accordance with tradi- 
tional solutions. Instead, rank order scales of 
vibrato preferences were developed on the 
basis of the total frequency of preferred judg- 
ments for each vibrato tone. Inasmuch as the 
main purpose of the study was to develop a 
series of rank order scales from which the 
preferred vibrato rates and extents could be 
determined, it was felt that further refinement 
through a paired comparisons solution was 
not needed. 
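The tallying procedure described here can be sketched in a few lines. This is an illustration only: the record format (first tone, second tone, and the subject's written 1 or 2) is an assumption based on the instructions given to subjects, and the three labelled tones and their judgments are hypothetical, not data from the study.

```python
# Sketch only: build a rank order scale from paired-comparison choices.
from collections import Counter


def rank_order_scale(judgments):
    """Count how often each tone was the preferred member of a pair and
    return (tone, count) tuples from most to least preferred.

    `judgments` holds (first_tone, second_tone, choice) records, where
    choice is 1 if the first tone was preferred and 2 if the second was.
    """
    wins = Counter()
    for first, second, choice in judgments:
        wins[first] += 0   # make sure never-chosen tones still appear
        wins[second] += 0
        wins[first if choice == 1 else second] += 1
    return sorted(wins.items(), key=lambda item: -item[1])


# Hypothetical judgments for three tones labelled A, B, and C.
example = [("A", "B", 1), ("B", "A", 2), ("A", "C", 1), ("C", "B", 1)]
scale = rank_order_scale(example)
print(scale)
```

A scale built this way orders the tones but says nothing about the distances between them, which is the refinement the authors chose to forgo.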

At each of the five octave levels tested, a
rank order scale of vibrato preference was
constructed for the non-trained group on the
basis of the total number of preferred judg-
ments made to each vibrato combination at
that level. The number of preferred judg-
ments was then plotted against the vibrato
combination of rate and extent to obtain a
series of vibrato preference curves.

Fig. 1. Curves of vibrato preferences for non-musicians at five octave levels of the equal-tempered
scale. Ordinate: frequency of preferred judgments. Abscissa: vibrato extents in tenths of a step
(rates held constant). (The curves for F#2, F#4, and F#5 were displaced upward by multiplying the
obtained frequencies by 1.5. This made adjustment for differences in number of judges.)

Figure 1 is a consolidated graph which
shows the effect of increasing vibrato extent on the
frequency of preferred judgments at each of
the five octave levels, when rate is held constant.
Since the number of subjects was not equal for
all groups, the curves for F#2, F#4, and F#5 were
displaced upward by multiplying the fre-
quencies by 1.5. This was to facilitate a
direct comparison.
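The displacement factor can be checked against the group sizes given under Subjects. A brief sketch, with hypothetical tallies; the helper names are illustrative and the comparison with the exact ratios is an observation added here, not a statement from the authors.

```python
# Sketch only: compare the rounded factor of 1.5 with the exact ratios
# of group sizes, then apply the factor to some hypothetical tallies.
def displacement_factor(n_reference, n_group):
    """Ratio that would put a smaller group's tallies on the scale of a larger one."""
    return n_reference / n_group


def displace(counts, factor):
    """Multiply each tally by the factor, as was done for the plotted curves."""
    return [round(count * factor, 1) for count in counts]


exact_a = displacement_factor(144, 93)  # Group A against Group B
exact_c = displacement_factor(144, 94)  # Group C against Group B
example = displace([400, 620, 510], 1.5)  # hypothetical tallies, not study data

print(round(exact_a, 2), round(exact_c, 2), example)
```

The exact ratios are slightly above 1.5, so the displaced curves sit marginally lower than strict proportional scaling would place them; the shapes of the curves are unaffected.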

Three features of the graph in Figure 1 are
of greatest interest:

(1) The curves are very consistent for all
scale positions, with only two slight inversions
occurring at a rate of 6.5 pulsations per second.
This consistency was further indicated by
rank order correlations computed between the
scale values obtained at the two different
octave levels for each group of untrained
subjects. The correlation coefficients were .90
for Group A (F#2 and F#5), .91 for Group B (F#3
and F#6), and .93 for Group C (F#4 and retested
at F#4). Since these coefficients were found to
be statistically significant, they tended to sup-
port the impression obtained from the graphical
analysis that vibrato preferences remained con-
stant over a wide portion of the musical range.

(2) With the two exceptions mentioned
above, all of the curves indicate a maximum
preference for an extent of 0.25 of a step, re-
gardless of the octave tested. The two in-
versions occurred at two of the octave levels,
where an extent of 0.10 appears to have been
slightly preferred when the rate was 6.5
pulsations per second.

(3) The curves for rates of 6.0, 6.5, and 7.0
pulsations per second all reach approximately
the same height for a given octave, indicating
that these rates were all equally preferred and
that extent was the primary determiner of
vibrato preference.

Fig. 2. Curves of vibrato preferences at three octave levels of the equal-tempered scale for groups
differing in scores obtained on the Seashore tests. Ordinate: frequency of preferred judgments.
Abscissa: vibrato extents in tenths of a step (rates held constant). Separate panels show the
highest and lowest thirds of the time test, timbre test, and pitch test groups.
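The rank order correlations reported under point (1) can be illustrated with a short computation. This is a sketch under an assumption: the text does not name the coefficient, and Spearman's formula for untied ranks is used here as the usual choice. The two rankings of the thirteen tones are hypothetical and are not the study's data.

```python
# Sketch only: Spearman rank order correlation for two untied rankings.
def spearman_rho(ranks_x, ranks_y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid when there are no ties."""
    n = len(ranks_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks of the 13 tones at two octave levels for one group.
first_level = list(range(1, 14))
second_level = [1, 2, 4, 3, 5, 6, 8, 7, 9, 10, 11, 13, 12]

rho = spearman_rho(first_level, second_level)
print(round(rho, 3))
```

With thirteen tones, a few exchanges of adjacent ranks leave the coefficient very close to 1, so values near .90 correspond to a somewhat larger reshuffling between the two octave levels than the one shown.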

In addition to the preference scales obtained 
at each level for the untrained group as a whole, 
a series of rank order scales was constructed for 
several subgroups. These subgroups were ob- 
tained by dividing each of the three larger 
groups approximately into thirds on the basis 
of scores made on the Seashore tests. Pref- 
erence scales were then secured for the highest 
and lowest thirds of each group. Curves de- 
rived from the groups taking the pitch, time, 
and timbre tests are presented in Figure 2. It 
is apparent from the similarity of these curves 
that the ability to discriminate time, pitch, and
timbre, as measured by the Seashore tests, was
of little consequence in vibrato preference. 

A rank order scale was developed for the 
trained musicians at octave level F#4. The re- 
sulting curves, together with those for the 
untrained group at the same octave level, are 
presented in Figure 3. Here the curves for 
the trained musicians were displaced upward 
by multiplying the frequencies of preferred 
judgments by 1.7. This took account of differ- 
ences in number of judges, and facilitates a 
direct graphical comparison. The curves for 
musicians show a maximum preference for an 
extent of 0.10 of a musical step, with the rates 
of 6.0 and 6.5 pulsations per second equally 
preferred, while the non-musicians prefer the 
same rates but a wider extent (0.25 of a 
musical step). 

The chi-square test was applied to determine 
whether or not the differences between the 
vibrato preferences of trained and untrained 
individuals shown in Figure 3 could be at- 
tributed to chance factors in sampling. A 
specific example will be given to illustrate the 
manner in which these statistical tests were 
made. The most preferred vibrato tone from
the rank order scale for musicians had a rate
of 6.5 pulsations per second and an extent of
0.10 step. The most preferred vibrato tone
from the rank order scale for non-musicians
had a rate of 6.0 pulsations per second and an
extent of 0.25 step. When these two tones
were presented to the 54 trained musicians for
judgment in a single paired comparison, 36
preferred the tone with a 6.5 rate and 0.10 step
extent while 18 preferred the other tone. For
the same pair of tones as judged by the 94 non-
musicians, only 41 preferred the tone with the
6.5 rate and 0.10 step extent. These fre-
quencies were used in a two by two table and
the chi-square value obtained was significant
beyond the one percent level of confidence.
The hypothesis was rejected that the observed
differences were due to fluctuations of random
sampling alone. Other chi-square tests on
single paired comparisons were made in a
similar manner. It was concluded that there
was a significant difference between the vibrato
preferences of musically trained and untrained
individuals. Furthermore, the difference de-
pended upon a different preference in extent of
vibrato, not in rate.

Fig. 3. Curves of vibrato preferences for trained musicians (N=54) and non-musicians (N=94) at
octave level F#4. Ordinate: frequency of preferred judgments. Abscissa: vibrato extents in tenths
of a step (rates held constant). (The curves for trained musicians were displaced upward by
multiplying the obtained frequencies by 1.7.)
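The worked example in the text can be reproduced from the reported frequencies. The sketch below assumes that the 53 remaining non-musicians preferred the other tone (a choice was required on every pair) and uses the standard chi-square formula for a two by two table. The text does not say whether a continuity correction was applied, so the correction is offered only as an option; the function name is hypothetical.

```python
# Sketch only: chi-square for the two by two table described in the text.
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square (1 degree of freedom) for the table [[a, b], [c, d]].

    With yates=True the usual continuity correction is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: musicians, non-musicians.
# Columns: preferred 6.5 rate / 0.10 extent, preferred 6.0 rate / 0.25 extent.
chi2 = chi_square_2x2(36, 18, 41, 94 - 41)
chi2_corrected = chi_square_2x2(36, 18, 41, 94 - 41, yates=True)

CRITICAL_1_PERCENT = 6.635  # chi-square, 1 degree of freedom

print(round(chi2, 2), round(chi2_corrected, 2), chi2 > CRITICAL_1_PERCENT)
```

The uncorrected value is about 7.3, which exceeds the one percent critical value and so agrees with the reported result; with the continuity correction the value falls to about 6.4, just short of that criterion.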


Discussion 


The findings reported in this study of vibrato 
preferences are generally in agreement with the 
data of previous investigations on the physical 
characteristics of the vibrato in artistic musical 
performance. The most preferred rate and 
extent, as judged by the untrained group of 
subjects, are similar to the tentative norms
established for artistic violin performance by
several investigators (2, 5, 9). These authors
report the average rate of the violin frequency
vibrato to be 6.5 pulsations per second, and the
average extent, 0.25 of a tone. Other studies
(3, 8) have shown that the typical rates for
professional violinists and vocalists are the
same, 6.5 pulsations per second, although the
violinist’s vibrato is only half as wide as the
singer's.

Ramsdell (4), employing trained musicians
as observers, determined the critical values of
rate and extent for maximal richness and for
singleness of pitch in a 500 cycle pure tone
with a frequency vibrato. The instructions
were given, at one time, to increase the rate of
modulation until a tone of apparently unitary
pitch was achieved, such as would be satis-
factory in a single voice. At another time, the
subjects were asked to vary the extent of
modulation until maximal richness was ob-
tained. The results indicate that the richest,
“most unitary” note occurred at a rate of 6.5
pulsations per second and an extent of approxi-
mately a semitone. Although the explanation
of all the effects of frequency modulation can-
not be given at the present time, the experi-
mental evidence seems to support the notion
that the close agreement between performed
and preferred vibrato rates and extents is
dependent upon factors other than those of
learning alone.


Summary 


1. The purposes of the present study were,
first, to determine for musically untrained indi-
viduals the preferred combination of rate and
extent in the frequency vibrato of a complex
tone at each of five octave levels in the equal-
tempered scale, and, second, to discover the
manner in which vibrato preferences varied
with the subjects’ musical training or native
ability as represented by scores on the perform-
ance tests.


2. The complex tone employed at each of 
five octave levels was generated by a multi- 
vibrator unit. This tone was then elec- 
tronically modulated at rates of 5.5, 6.0, 6.5, 
and 7.0 pulsations per second. The extents of 
the frequency modulation were 0, 0.10, 0.25, 
and 0.40 of a whole musical step. The vibrato 
tones were presented to groups of subjects for 
judgment by the method of paired comparisons. 

3. Rank order scales of vibrato preference 
were obtained for two main groups: (1) un- 
trained individuals, and (2) trained musicians. 
At each octave level tested, preference scales 
were also obtained for the untrained subjects 
on the basis of their ability to discriminate 
pitch, time, and timbre as determined by tests 
from the Seashore battery. 

4. A comparison of the preference scales for 
untrained individuals indicated that (a) an 
extent of 0.25 step was preferred over a wide 
portion of the equal-tempered scale, (b) rates 
of 6.0, 6.5, and 7.0 pulsations per second were 
about equally preferred over the same range, 
(c) native auditory ability had little effect on 
vibrato preference, and (d) scale values for the 
retest group showed a high reliability of vibrato 
preference judgments. 

5. The trained musicians tended to prefer the 
same rates of 6.0 and 6.5 pulsations per second 
as did non-musicians, but the musicians 
favored a narrower extent (0.10 of a step). 


Received August 30, 1949. 


References 


1. Corso, J. F. Preferred rate and extent of the fre- 
quency vibrato. Unpublished Master’s Thesis, 
State Univ. of Iowa, 1948. 

2. Hollinshead, M. T. A study of the vibrato in 
artistic violin playing. Univ. Ia. Stud. Psy- 
chol. Music, 1932, 1, 281-288. 

3. Metfessel, M. The vibrato in artistic voices. 
Univ. Ia. Stud. Psychol., 1932, 1, 14-117. 

4. Ramsdell, D. H. The psychophysics of frequency 
modulation. Unpublished thesis, Harvard Univ., 
1935. 

5. Reger, S. N. The string instrument vibrato. 
Univ. Ia. Stud. Psychol. Music, 1932, 1, 305-340.

6. Seashore, C. E. Phonophotography in the meas- 
urement of the expression of emotion in music 
and speech. Sci. Mon., 1927, 24, 463-471. 

7. Seashore, C. E., and Metfessel, M. Deviation
from the regular as an art principle. Proc. Nat.
Acad. Sci., 1925, 2, No. 9, 538-542.

8. Seashore, H. G. An objective analysis of artistic
singing. Univ. Ia. Stud. Psychol. Music, 1935,
4, 12-157.

9. Small, A. M. An objective analysis of artistic
violin performance. Univ. Ia. Stud. Psychol.
Music, 1936, 4, 172-229.

10. Stanley, D. The science of voice. New York: 
Fischer, 1929. 


11. Stevens, S. S., and Davis, H. D. Hearing: its psy-
chology and physiology. New York: Wiley, 1938.

12. Terman, F. E. Radio engineers handbook. New 
York: McGraw-Hill, 1943. 

13. Terman, F. E. Measurements in radio engineering. 
New York: McGraw-Hill, 1935. 

14. Wagner, A. H. Remedial and artistic development
of the vibrato. Univ. Ia. Stud. Psychol. Music, 
1932, 1, 166-212. 




Book Reviews 


Chapanis, A., Garner, W. R., and Morgan,
C. T. Applied experimental psychology.
Human factors in engineering design. New
York: John Wiley and Sons, Inc. 1949. Pp.
xi+434. $4.50.


Serious students of man and his work en-
vironment will find this book not only pro-
vocative but a major contribution in terms of
method and technique for the study of conjoint
problems. Management personnel, produc-
tion engineers, industrial consultants, and psy-
chologists will find clear concise statements
regarding the use and application of psychologi-
cal principles to practical problems. The tech-
niques for the solution of many problems which
have confounded these groups are clearly set
forth.

The book is far from perfect, as the authors
would be the first to state. Most of the data
and examples are taken from research spon-
sored by the military services with only isolated
instances from industrial situations. This re-
flects the failure of industry to recognize the
importance of such research for their own
problems of production and operation. The
reader is struck with the wide gaps in the body
of human engineering knowledge now available.

This book, however, will in all probability
stimulate extensive psychological and engineer-
ing studies to fill these gaps at the earliest
possible moment.

The authors have brought to bear on the
problem of the interactions of man, equipment,
and his job the research findings of experi-
mental psychologists, industrial engineers, phys-
iologists, and anthropologists. They have
presented this knowledge under four broad
categories.


1. The effective design of visual displays.
This includes a presentation of recent findings
on the size, shape, and legibility of letters,
numbers, and scale graduations. Evidence is
presented on how best to arrange and group
displays.

2. The effectiveness of auditory communica-
tion of information. Techniques for increasing
speech intelligibility and the recognition of
tonal signals are presented together with ex-
amples of applications.

3. The effective design of operational con- 
trols. This section deals with the optimal 
control sizes, shapes, types of movement, gear 
ratios and the resistance of controls. 

4. The effective arrangement of individual
work places and grouping equipment. This 
section goes beyond the work of time and 
motion engineers and concerns itself with the 
interactions between man and man, man and 
machine, and machine and machine. These 
interactions are referred to as “links” and quan- 
titative measurements are developed which 
permit units to be linked into an integrated 
system in accordance with the psychological 
and physical requirements of personnel. 


One of the most important sections of the 
book is the short chapter on the use of statistics 
in the analysis of errors. Essentially the 
authors have applied the methods of analysis 
of variance and of errors of measurement which 
are widely used in the theory of test construc- 
tion. The power of these tools and their wide 
application to other problems is immediately 
apparent to the reader. 

The value of this book lies in the examples 
of how the methodology of experimental psy- 
chology and statistics can be applied to the 
solution of a wide variety of problems involving 
men, equipment and the job they have to per- 
form. In the opinion of the reviewer, the 
writers have rendered a real service not only 
to psychologists, engineers, and management 
but also to the man on the job, by summarizing 
the present status of knowledge in human 
engineering and indicating how this knowledge 
can be applied in practical situations. 

Jack W. Dunlap 
Dunlap and Associates, Inc., 
New York, N. Y. 


Pease, Katharine. Machine computation of 
elementary statistics. New York: Chartwell 
House, Inc., 1949. Pp. 239. $2.75. 


“This manual is for students learning to use
computing machines in connection with courses
in elementary statistical methods. It is set up
to be self-teaching, so that the student, by
following the procedures in sequence, may
learn to use the machines with a minimum
amount of help from the instructor.” With
this introduction to the preface of her manual, 
the author has proceeded to outline, in careful 
detail, the standard calculating machine pro- 
cedures to be used in addition, subtraction, 
multiplication, division, and extraction of 
square roots, and in obtaining the mean, stand- 
ard deviation, product-moment correlation 
coefficient, and percentile and standard scores. 
A separate set of step-by-step procedures in 
calculation (including appropriate checks for 
errors) is provided for each of the commonly 
used models of Friden, Marchant, and Monroe 
calculating machines. 

Practice is also given in complementary 
numbers, accumulative and subtractive multi- 
plication, multiplication by a constant, and 
reciprocals, and in the use of tables of products, 
reciprocals, quotients, squares, and square 
roots. Standard forms for computing sheets, 
and a list of 26 references round out the man- 
ual’s offerings. 

The result is an understandable and highly 
usable manual which should facilitate the 
learning of calculating machine procedures 
either by the student formally enrolled in an 
elementary statistics course, or by the indi- 
vidual who wishes to learn these techniques 
on his own. Since it limits itself to the mechanical
aspects of computation, it complements, rather
than supplants, the more conventional elementary
statistics text.


Kenneth E. Clark 


University of Minnesota 


Boynton, Paul W. Selecting the new employee.
New York: Harper and Brothers, 1949. Pp.
136. $2.00.


This is a practical, down-to-earth review of
the major principles of employment selection
written in a simple, readable, non-technical
manner. It is particularly well suited to use
by [illegible] interested in establishing an
employment department or by beginners in
employment work who are seeking [illegible]
in this field.


The author places justifiable stress on the
obligation of the employer to his [illegible]
and emphasizes the importance of proper selection
and placement. He next [discusses] the
qualifications of the employment man. This
is followed by an outline of sources and
methods of recruitment, with particular reference
to college recruiting. Next follows a discussion
of the functioning of the employment
department, which is probably one of the better
parts of the book. In a manner which [reflects]
the author's long years of experience in practical
employment work, he brings out the importance
of building acceptance for the employment
department and indicates many ways
in which this can be done. The balance of the
book is devoted to an exposition of interviewing
techniques, a brief discussion of [tests], and a
short treatment of induction and follow-up.
Probably one of the most valuable portions
of the book is the author's [illegible]
realistic appraisal of the contribution of
psychological tests to employment work.

If one were disposed to criticize the book, it
would be chiefly on two counts: the first, that
it contributes little that is new or [illegible]; it
is simply a popularly and clearly written
[hand]book of sound (but far from all-inclusive)
employment procedures. It is [illegible] written
by a man who has had a wealth of practical
employment experience but is not well
acquainted with the literature or with the
more sophisticated developments in the field.


Robert N. McMurry


Robert N. McMurry and Company, 
Chicago, Illinois 


Bennett, George K., and Cruikshank, Ruth M.
A summary of clerical tests. New York:
Psychological Corporation, 1949. Pp. 122.
Paper, $1.25.


This booklet is “an attempt [to bring] together
in a single publication [pertinent] information
regarding tests used in selecting and upgrading
clerical workers”; it is a companion to an earlier
booklet dealing with manual and mechanical
ability tests. There are two distinct parts: a
discussion of the development and use of tests
of clerical ability, and a [series] of brief
descriptions of specific clerical tests.

An historical sketch of the [development] of
clerical occupations and survey of developments in
clerical testing from 1912 until World War II
are followed by a review of the types of items
used in clerical tests, material which could be
very useful to persons seeking ideas for clerical
test construction.

Two chapters are devoted to job descriptions 
of clerical occupations and to reviews of studies 
in the selection of workers for these occupa- 
tions: these are helpful as an overview of what 
has been tried, but they impress this reviewer 
as being insufficiently analytical and integra- 
tive. That some attempt was made along 
these lines is illustrated by the authors’ noting 
of the fact that Hay’s Pennsylvania Co. norms 
for the Minnesota Clerical Test are very 
similar to those compiled in the USES work 
of Stead, Shartle, and others. But in general 
the criteria of success are not carefully ex- 
amined (e.g., on page 26, Thorndike’s use of 
earnings at ages 20-22, when beginning workers 
have not yet had a chance to prove their worth 
and to earn accordingly) and sometimes they 
are not even described (e.g., Oberheim’s
“library performance” criterion on page 24):
a serious defect in view of the recent emphasis
on the need to validate the criterion. Sometimes
results which might mystify or even disillusion
the unsophisticated are not discussed, as
in the case of the perfectly logical negative
correlations between Otis scores and salary cited
in Table V, p. 28.

Chapter VI is a useful discussion of who
should be tested, when, and by whom: content
often omitted in the more academic treatises,
but especially necessary in those read by
laymen.

The test summaries which take up 45 of the
[?]3 pages of text give publication and purchasing
data excepting cost (wise in these inflationary
days), administration and scoring
methods and time, type of content, reliability,
validity, norms, and some evaluative comments.
These descriptions range in length
from one-half page to about one and one-half
pages for each of 32 tests. The descriptive
material is more adequate than the evaluative,
for while the authors have given as their objective
the inclusion of “a maximum of objective
information regarding . . . reliability and
[validity],” this information ranges from
“[illegible] reported,” through “Correlation
coefficient of .63 between speed scores and
supervisor’s ratings,” to somewhat more detailed
data not exceeding seven lines.

The analytical comments sometimes do 
something to remedy this paucity of detailed 
objective evidence in the test summaries, as 
when it is stated “The manual does not define 
the superior group as to number of cases or 
number of steps in the rating scale” (p. 99), or 
when the authors say (of one of Bennett’s 
clerical tests), “It needs further study to show 
its usefulness in differential placement work” 
(p. 88), or “This is another among the recently 
published tests (in this case not Bennett’s) 
which include some normative data but little 
except face validity or item types known to be 
valid to indicate its areas of usefulness” (p. 99).
But these comments are brief, and often
non-evaluative. For example, concerning one test
which undeservedly (according to the authors’
own criteria) receives a full page of description,
the authors write simply that it is an American 
revision of an English test, that the author 
points out that good secretaries do well on the 
whole test whereas routine office workers do 
poorly on certain subtests, and that scores 
increase with schooling and are probably re- 
lated to intelligence. They do not state that 
the American validation of this test is ex- 
tremely limited (only 3 of 5 references have 
any such data) and that the mentioned lack of 
occupational norms makes it useless in voca- 
tional counseling or selection. 

Thirty-two generally available tests are de- 
scribed in the manner just discussed; 8 tests, 
the use of which is restricted, and another 34 
tests which are no longer available or virtually 
unused, are also described more briefly. While 
it is admittedly difficult to categorize tests in 
this manner to the satisfaction of all readers, 
this reviewer is inclined to believe that Bennett 
and Cruikshank would have rendered a greater 
service to personnel psychology if they had 
been more analytical and evaluative in the 
discussion of each test, or if they had applied 
more rigorous standards in deciding which 
tests should be treated. One test publisher 
has told the reviewer in conversation that the 
mere mention of a test in a book on tests in- 
creases the sale of that test: if this is so, then 
a number of tests which have already been tried 
and found wanting are likely to receive a new 
lease on life from this publication. It might
have been a greater service to personnel
psychology and vocational guidance if Bennett
and Cruikshank had given more space to the 
tests which the evidence shows to be most 
worth using, and had relegated to a small-type 
appendix all of the less promising tests which 
needed to be discussed for historical reasons or 
for completeness of coverage. 

This raises the question of the readers for 
whom the authors were writing, of the intended 
and probable use of the booklet. Personnel 
psychologists will find this summary a time- 
saving survey of the field, a very valuable 
source of leads to studies they should be 
familiar with, and a helpful reminder of what 
is available in the way of tests; they will also 
find that the test summaries do not enable them 
to make final judgments about tests, but that 
they tell enough to indicate what is worth 
looking into more intensively both in the 
literature and in their own research. Vocational
counselors will also find the survey and
the leads helpful, but will need to go to Buros’ 
yearbook and to the more intensive treatises of 
specific tests found in some texts. Personnel 
managers and other executives, who may have 
had a little training in measurement but who 
are not psychologists, are likely to be confused 
by the large number of briefly described and 
sketchily evaluated tests. As this last is one 
of the largest probable consumer groups, and 
also one which is likely to make least use of the 
literature to which this booklet could lead 
them, this limitation becomes even more im- 
portant. Bennett and Cruikshank are not the 
first authors who have, because of the very 
nature of their field, written for too hetero- 
geneous an audience; but it would be inter- 
esting to see what they would produce, if they 
wrote two versions of the booklet, one for
technicians and counselors highly trained in 
testing, and one for personnel workers and 
counselors with at most a course or two in 
measurement. 

In closing, the reviewer, whose eye tends to 
be somewhat jaundiced when reading test 
discussions by test authors and publishers 
(timeo Danaos, et dona ferentes), would like to 
point out that Bennett and Cruikshank have 
conscientiously treated their own tests as
objectively as those of other authors and
publishers.

Donald E. Super


Teachers College, 
Columbia University 


Weitzman, Ellis, and McNamara, Walter
J. Constructing classroom examinations: A
guide for teachers. Chicago: Science Research
Associates, 1949. Pp. xvi+153.
$3.00.


Of all the guide books designed to assist
teachers in building effective classroom
examinations this is perhaps the most elementary.
Beginning with validity and reliability and
ending with the statistical analysis of test
scores, it covers in simple, non-technical
language the customary topics, briefly and
superficially. Its greatest value will accrue to those
teachers or prospective teachers who know
nothing about objective test construction and
who desire to learn but little.

Those who read this book for the purpose of
becoming informed about important advances
in achievement test construction during the
past twenty years will be disappointed.
Validity and reliability are defined and methods
of measuring them given but the reasons why
they are important and the factors which
influence them are neglected. Instead of
emphasizing the place of course objectives in the
process of item construction and sampling, we
find recommended a topical outline of subject
matter with textbook page references. No
instructions are given regarding ways of
organizing content and objectives to facilitate [test]
construction. Many of the test items used as
examples are excellent but there is a
preponderance of the more factual type. The
teacher-made answer sheet and perforated scoring [key]
is not treated, but the consumable type of [test]
with panel scoring key is. Eight pages are
devoted to a cumbersome procedure of item
analysis designed to yield a measure of the
difficulty of the items. One page is devoted
to a simple index of discriminating power
without emphasizing its importance or the
factors which influence it. Percentile [ranks]
are not computed from the [midpoints of]
the scores. The student learns to compute
means and standard deviations but standard
scores are not mentioned. Certainly, better
manuals on the construction and use of
achievement tests have been available for many years.


Walter W. Cook 


University of Minnesota 


Cavan, R. S., Burgess, E. W., Havighurst, R.
J., and Goldhamer, H. Personal adjustment
in old age. Chicago: Science Research
Associates, 1949. Pp. xiii+204. $2.95.


The distinctive contribution of this book to
applied psychology lies in its detailed account
of the development, testing and application
of an Attitude Inventory and an Activity
Inventory for the study of persons past sixty
years of age. The aim of these inventories is
to secure data on activities and attitudes in
various areas including health, family, friends,
work and economic security.

In testing the validity of these inventories,
interesting auxiliary schedules were developed:
a check-list of personal characteristics which an
interviewer might observe, a set of word
portraits, and a list of symptoms supposed to
indicate senility (Appendixes D, E, F).

More than 8,000 schedules were mailed out.
More than half of them went to retired teachers,
retired ministers and widows of ministers.
More than a quarter of them went to sociology
professors who distributed them through their
students. Usable schedules returned by mail
numbered 2743, and these were supplemented
by 245 schedules obtained by interview (pp.
[illegible]-172).

Table 30 (p. 134), based on the entire group
of 2,988 responses to the Attitude Inventory,
[gives] the correlations of partial scores
(indicating degrees of adjustment in different areas)
with one another and with total scores.
Attitude toward leisure showed highest correlation
with the total score in both men and women
([illegible] .70); happiness, feeling useful, and
satisfaction in work came next; religious
attitudes had the lowest correlations (.35 and .29).


On the position of religion the authors make 
the following comment: “This is not sur- 
prising, for almost a third of the subjects were 
ministers or their wives, and they probably 
show relatively little variation in religious atti- 
tudes while they show a wide variation in 
other attitudes" (p. 133). 

Tables 7 to 17 (pp. 48 to 59) are based on a 
smaller “study group" of 499 men and 759 
women (pp. 46-48 and Appendix C). The 
relation between the "study group" and the 
total group is not made clear in the text. 
Replying to a query from this reviewer Dr. 
Havighurst states: "The ‘study group’ con- 
sisted of all the respondents except the two 
major occupational groups, namely the retired 
teachers and the retired ministers and their 
wives. Thus, the ‘study group’ consists of the 
people who are described on pages 170 and 171 
under paragraphs (d) and (e), and the groups 
described in the first few paragraphs on page 
171." 

Trends with age found in the “study group" 
include (p. 60): Increased feeling of economic 
security in spite of lowered amount of income. 
Increase in religious activities and dependence 
on religion. Decrease in feelings of happiness, 
usefulness, zest, and a corresponding increase 
of lack of interest in life. 

Sex differences include the following (p. 61): 
Women feel somewhat more secure economi- 
cally than men. Women report more physical 
handicaps, more illness, more nervous and 
neurotic symptoms, and more accidents; they 
feel less satisfaction with their health than do 
men. Women have more religious activities 
and more favorable attitudes toward religion 
than do men. Women are less happy than 
men. 

The authors have made an exceptionally 
valuable contribution by their thorough, 
cautious, and critical development of the two 
inventories and by using them in a study of a 


large number of cases. 
Albert R. Chandler 


Ohio State University 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor,
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota


The individual and his religion. Gordon W. 
Allport. New York: The Macmillan Co., 
1950. Pp. 147. $2.50.

The supervisors management guide. Louis 
Baldwin et al. New York: American Man- 
agement Association, 1950. Pp. 200. $3.50. 

Know yourself. A workbook for those who 
stutter. Revised edition. Bryng Bryngel- 
son, Myfanwy E. Chapman, and Orvetta K. 
Hensen. Minneapolis: Burgess Publishing 
Co., 1950. Pp. 159. $2.00. 

Making work human. Glen U. Cleeton. Yel- 
low Springs, Ohio: Antioch Press, 1949. 
Pp. 326. $3.75. 

Occupational therapy. William R. Dunton, Jr. 
and Sidney Licht, Editors. Springfield, Ill.: 
Charles C Thomas, Publisher, 1950. Pp. 
350. $6.00. 

A handbook of employment interviewing. John
M. Fraser. London: Macdonald and Evans,
1950. Pp. 212. 8/6d.

Theory and practice of psychological testing. 
Frank S. Freeman. New York: Henry Holt 
and Co., 1950. Pp. 518. $3.50. 

Fields of psychology. Second Edition. J. P. 
Guilford, Editor. New York: D. Van Nos- 
trand Co., Inc., 1950. Pp. 779. $5.00. 

A handbook of applied psychology. Douglas H. 
Fryer and Edwin R. Henry, Editors. New 
York: Rinehart and Co., 1950. Two Vol- 
umes, pp. 826. $12.50. 

Counseling adolescents. Shirley A. Hamrin and 
Blanche B. Paulson. Chicago: Science Re- 
search Associates, 1950. $3.50. 

Learning and instruction. National Society
for the Study of Education, Forty-Ninth
Yearbook, Part I. Nelson B. Henry, Editor.
Chicago: University of Chicago Press, 1950.
Pp. 352. $2.75.

The education of exceptional children. National
Society for the Study of Education, Forty-Ninth
Yearbook, Part II. Nelson B. Henry,
Editor. Chicago: University of Chicago
Press, 1950. Pp. 400. $2.75.

Situational factors in leadership. John K.
Hemphill. Columbus: Bureau of Educational
Research, Ohio State University, 1949.
Pp. 135. $3.00, cloth; $2.50, paper.

Child development. Second edition. Elizabeth
B. Hurlock. New York: McGraw-Hill Book
Co., Inc., 1950. Pp. 669. [Price illegible.]

How to be happy though young. George Lawton.
New York: The Vanguard Press,
1949. Pp. 300. $3.00.

The science of chance. Horace C. Levinson.
New York: Rinehart and Co., Inc., 1950.
Pp. 348. $2.00.

The meaning of anxiety. Rollo May. New
York: The Ronald Press Co., 1950. Pp.
376. $4.50.

The culture of industrial man. Paul Meadows.
Lincoln: University of Nebraska Press, 1950.
Pp. 216. $3.75.

Job evaluation. John A. Patton and Reynold
S. Smith, Jr. Chicago: Richard D. Irwin,
Inc., 1950. Pp. 338. $4.50.

The envelope. James S. Plant. New York:
The Commonwealth Fund, 1950. Pp. 299.

Introduction to psychosomatic medicine. C.
Alberto Seguin. New York: International
Universities Press, 1950. [Pages and price
illegible.]

How to make achievement tests. Robert M. W.
Travers. New York: The Odyssey Press,
1950. Pp. 180. $2.25.

Human relations in modern industry. R. F.
Tredgold. New York: International
Universities Press, 1950. [Pages illegible.] $2.50.

The development of a test for selecting research
personnel. Manpower Branch, Human
Resources Division, Office of Naval Research.
Pittsburgh: American Institute for Research,
1950. Pp. 33.




Journal of Applied Psychology 


Vol. 34, No. 4


August, 1950


The Adequacy of Employee Selection Reports 


Margaret Hubbard Jones 
University of California at Los Angeles 


An examination of well over 2,100 references
from the obtainable world literature on employee
selection has permitted an analysis of
practices in both experimental design and report,
the results of which are perhaps surprising.
These references cover the period 1906 to 1948,
within the ability of the author to locate the
references and within the ability of reference
librarians to locate copies of unusual and foreign
periodicals and monographs. There are
further limitations to the data to be analyzed
here which were imposed by the primary
purpose of the literature search. That aim was
the compilation of abstracts of employee selection
reports which should contain the actual
data presented, together with sufficient information
to enable the reader to evaluate the
study without referring to the original report.

The work was prompted by the difficulty
experienced in this particular field in locating
the widely scattered references (we had reference
to more than 300 separately-titled periodicals,
and many volumes of each one, as
well as books and monographs) and by the
fact that most industrial psychologists do not
have the time or facilities necessary to review
this literature. These abstracts appear
elsewhere (2).

Due to the large volume of material in this
field and to the fact that many articles which
[appear] important contain virtually no
information, it did not seem economic to abstract all
possible references, and the survey is thus
limited to those studies which can be evaluated:
those in which relatively complete validation
[data] are presented, together with specific tests
used and job studied. Further, since we
were interested in selection of employees for
industrial concerns we have also excluded the
special fields of selection for the armed forces
and pilot training as posing special problems.
It has by now become abundantly obvious that 
seemingly slight changes in working conditions, 
incentives and parent population, to mention 
the more obvious factors, may result in the 
failure of even a carefully executed selection 
program. In view of this, it seemed wiser to 
exclude those studies in which the criterion 
was school grades or teachers’ ratings, and 
those carried out on military personnel even 
where the jobs are similar to civilian jobs, be- 
cause of the differences in motivation and 
working conditions. After we have eliminated 
the reports which are, by our definition, special 
problems and those which are so inadequately 
presented or executed that they cannot even be 
evaluated by the reader (a large proportion of 
the total) there remain 427 reports, or 20% of 
the total number of references. In this analysis 
we shall be concerned entirely with these 427 
reports which can be evaluated. Since many 
articles report results on diverse jobs, very 
often with different tests and different statis- 
tical procedures, we have at times referred to 
the number of separately-treated groups rather 
than to the number of titles and have endeavored
to make the distinction clear wherever the
former occurs. 

These 427 references are largely American— 
about 80%—both because of the greater avail- 
ability of privately published and unpublished 
material of American origin and because of the 
larger total volume of articles. It does not ap- 
pear that the percentage of acceptable articles 
is much larger for American work than for that 
of any other country. 

The volume of acceptable articles is shown 
by year in Figure 1. The slow rise to a peak 
after World War I, followed by a decline 
through the depression era is not unexpected.
Whether this can be entirely attributed to
overselling of testing, as is usually done, or whether
it does not also reflect to some extent the gen- 
eral decline in business activity is a debatable 
point. The annual volume of articles reached 
a high again in 1941, as business was recover- 
ing, fell off, understandably, during the war 
years, and has now reached an all-time high. 
Whether it will remain high probably depends 
partly on the quality of current work and 
partly on the general level of industrial activity. 
The ten jobs which have been most fre- 
quently studied are as follows: Salesmen, 75; 
Clerical Workers, 60; Teachers, 49; Assemblers, 
23; Executives, 23; Inspectors, 23; Supervisors, 
21; Typists, 17; Stenographers, 14; and
Machinists, 9. This order does not necessarily
reflect either the importance of the job or the 
difficulty of selecting good workers. Among 
the jobs which appear to have been acceptably 
reported but once are brick-layers, grocers, 
scientists and deans of women. As can be 
seen, salesmen lead the list. This, of course, 
includes salesmen of all sorts and many of the 
jobs are quite different. The same comment
applies to the other categories. There is
usually no way of determining from the published
reports whether the jobs in two studies
are comparable. As Ghiselli has shown, there
is an astounding range of reported validity
coefficients for the same general type of test
in any broad occupational classification
[reference number illegible].
For clerical occupations he found the range to
be over .90. There are many reasons for this
state of affairs but one which has perhaps not
been sufficiently emphasized is the lack of
adequate job description in published reports.
It is now fairly generally recognized that a
selection program is very much situation-bound
but the corollary, that this requires
precise job description, is not currently
practiced.


Fig. 1. Number of employee selection reports by year.

Let us now examine in more detail the
reports which represent the cream of the crop.
The number of subjects used in the investigation
is an important factor in determining the
predictive value of the results. Except in the
majority of cases using less than 20 subjects,
the N in and of itself does not tell the whole
story. Much depends upon how the [data are]
treated and whether or not the total population
of employees on a particular job, or a
representative sample thereof, was used.
Nevertheless, it is instructive to analyze the trend in
this respect. The results of the analysis by
number of subjects are as follows: Less than 10,
17; 10-19, 97; 20-29, [?]3; 30-49, 129; 50-99,
188; and 100 and above, 257. Even more
unexpected than the number of groups with small
N is the number with 50 or more and the 257
groups containing 100 or more subjects. The
latter are by and large the more recent studies
and the trend is encouraging.


Statistical Techniques 


An analysis of the statistical techniques used
for presentation of the results of validation
procedures is interesting, but again the particular
statistic used does not guarantee adequacy
of treatment because the assumptions governing
its use may not have been met and the
statistic best suited to a given problem may
not have been chosen. Table 1 shows the
frequency with which various measures are
used. Correlational techniques are the most
popular, accounting for 285 out of 525
separately-treated groups. Of these only 172 give
measures of significance, and although they can
be calculated from the data provided, it is
safer, considering the heterogeneous nature of
the audience in this field, to present the standard
errors along with the coefficients of correlation.
Furthermore, the author has the real
[responsibility] for complete presentation of all
[the data] necessary to an interpretation of
his [results]. Occasionally one even finds
an author concluding that the correlations
reported have clearly shown a relationship between
test scores and criterion whereas actual
calculation shows the correlations to be not
significantly different from zero. This practice
is general and no single individual should
shoulder the blame for it.


Table 1

Statistical Measures Used for Validation

Measure                     Number of Groups So Treated

Correlation
    r                                136
    Rho                               94
    R                                 35
    r bis                              7
    tetrachoric                        8
    other                              5
    Total                            285
Group Comparison                     185
Inadequate Treatment                  55

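
As an illustration of the check described above, whether a reported correlation differs reliably from zero, the following Python sketch computes the usual t-ratio for a product-moment r and a large-sample standard error. It is not taken from any of the surveyed reports. The function names, the sample values (r = .30, N = 20), and the choice of the formula (1 - r^2)/sqrt(N - 1) for the standard error are assumptions made for the example; the critical value 2.101 is the tabled two-tailed 5 per cent point of t for 18 degrees of freedom.

```python
import math


def t_for_correlation(r: float, n: int) -> float:
    """t-ratio for testing r against zero; it has n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 pairs of scores")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def standard_error_of_r(r: float, n: int) -> float:
    """Large-sample approximation to the standard error of r (an assumed choice)."""
    if n < 2:
        raise ValueError("need at least 2 pairs of scores")
    return (1.0 - r * r) / math.sqrt(n - 1)


if __name__ == "__main__":
    # Hypothetical validity coefficient from a small group.
    r, n = 0.30, 20
    t = t_for_correlation(r, n)
    critical_t = 2.101  # tabled value: 18 df, 5 per cent level, two-tailed
    print(f"r = {r:.2f}, N = {n}, t = {t:.2f}, S.E. = {standard_error_of_r(r, n):.3f}")
    print("significant at 5 per cent" if abs(t) > critical_t else "not significant at 5 per cent")
```

With these hypothetical figures the t-ratio falls short of the tabled value, which is the situation the paragraph describes: a coefficient that looks respectable but cannot be distinguished from zero in a group of this size.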

Group comparisons of various sorts account 
for 185 cases, but of these only 28 include 
measures of the significance of group differences 
(although sometimes such measures could be 
calculated by the reader). Group comparisons 
may take such forms as: differences in mean 
test scores between the upper and lower 50% 
of employees as judged by the criterion, or 
average test scores for groups judged best, 
average and poorest by their supervisors (many 
times without N or sigma for each group being
indicated), or per cent of those scoring within
certain limits who were judged good as against
the per cent who were judged poor, etc. Only
occasionally are critical ratios or t-ratios or
similar measures included. The importance of 
testing results for significance cannot be over- 
emphasized. In the case of group comparisons 
it is more serious than where correlation is 
used because in most cases there are not suffi- 
cient data to enable the reader to perform the 
proper calculations for himself. In one partic- 
ular case, where the author concluded that 
his tests were efficacious for selection but 
neglected to supply any measure of the signifi- 
cance of the differences found between groups, 
calculation of the significance of percentage dif- 
ferences (the only data available) showed the 
differences to be exceedingly insignificant. 
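
A check of this kind on percentage differences can be sketched as follows. The sketch assumes a pooled two-proportion critical ratio with a normal approximation; the source does not say which formula was used, and the counts below are hypothetical.

```python
import math

def critical_ratio_for_percentages(good_a, n_a, good_b, n_b):
    """Critical ratio (z) for the difference between two independent proportions.

    good_a / n_a and good_b / n_b are the proportions judged "good" in each group.
    Assumption: pooled standard error, normal approximation.
    """
    p_a = good_a / n_a
    p_b = good_b / n_b
    pooled = (good_a + good_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; proportions are all 0 or all 1")
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return z, p_value

# Hypothetical report: 60% of 20 high scorers judged good against 40% of 20 low scorers.
z, p_value = critical_ratio_for_percentages(12, 20, 8, 20)
print(f"critical ratio = {z:.2f}, two-sided p = {p_value:.3f}")
```

With groups this small, a difference of twenty percentage points gives a critical ratio below 2, so it cannot be called reliable.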

In 55 cases there is incomplete statistical 
analysis. In a few cases the raw data are pre-
sented with no summary statistics. In many
instances we find the results expressed only as
“per cent agreement” between test scores and
criterion scores, or a brief statement that a 
critical score of a certain magnitude would 
have eliminated a given per cent of the poor 
group and ordinarily a smaller per cent— 
although we do not know that it is a reliably 
smaller per cent—of the good group. In 5 
cases the authors are content to present graphs 
alone, sometimes with the differences very 
much exaggerated by the scale and baseline 
chosen. 

One gains the impression that many times, 
even where adequate statistics are used, the 
basic requirements for their use have not been
met. One should be able to assume, for ex-
ample, that when an r is reported the conditions 
for its proper use have been met, but in view of 
the inadequacy of many of the statistical treat- 
ments one cannot always so assume. A more 
obvious criticism of many studies is the manner 
in which subjects are selected. In spite of the 
fact that the assumptions underlying many of 
the statistics used require a reasonably random 
sample, biased rather than random sampling 
seems to be the rule. It is a common pro- 
cedure to select certain employees to serve as 
subjects but rarely are we given any informa- 
tion which indicates that the sample was a 
selected one or how it was selected. A frequent 
practice is the artificial creation of a hetero- 
geneous experimental group by the use of only 
extreme employee groups (the upper and lower 
25%, for example), a practice which may 
spuriously raise the validity coefficients. 
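
The effect of using only extreme groups can be illustrated with a small simulation. This is a sketch under stated assumptions: test and criterion scores are taken to be bivariate normal with a hypothetical true correlation of .30, and employees are ranked on the criterion. None of the names or values comes from the studies surveyed.

```python
import math
import random

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def extreme_group_demo(true_r=0.30, n=4000, seed=0):
    """Compare r in the full group with r in the upper and lower 25% only."""
    rng = random.Random(seed)
    test = [rng.gauss(0, 1) for _ in range(n)]
    noise_sd = math.sqrt(1 - true_r ** 2)
    criterion = [true_r * t + noise_sd * rng.gauss(0, 1) for t in test]

    # Rank employees on the criterion and keep only the lowest and highest quarters.
    ranked = sorted(zip(criterion, test))
    quarter = n // 4
    extremes = ranked[:quarter] + ranked[-quarter:]

    r_all = pearson_r(test, criterion)
    r_extremes = pearson_r([t for _, t in extremes], [c for c, _ in extremes])
    return r_all, r_extremes

r_all, r_extremes = extreme_group_demo()
print(f"full group r = {r_all:.2f}; extreme groups only r = {r_extremes:.2f}")
```

Dropping the middle half of the group inflates the spread of the criterion, and the coefficient computed on the extremes comes out noticeably higher than the one for the whole group.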

* The criticism of the use of statistical methods in
research on the Rorschach Test by Cronbach (1) may
be applied in part to employee selection research even
where other tests are used. Especially to be noted are
his criticisms of the selection for emphasis of a few
“significant” differences from among many insignificant
ones, whether the comparisons are explicitly made or
merely implied, and his insistence on the use of a second
independent sample so that chance variations in test
scores will not be given undue weight.


Margaret Hubbard Jones


A point too often overlooked is that a selec-
tion program is intended to select among
applicants, not among employees, and the two
groups are not identical (cf. 5, 7, 10). “Natural
selection” on the job—the survival of the
fittest—operates to make the employee group
more homogeneous than the applicant group.
This may spuriously lower validity coefficients
and change critical scores. Further, the em-
ployee group will often not show a normal dis-
tribution in a trait which is highly correlated
with ability to produce on the job and since
in most industrial situations it will be im-
possible to correct for this error, the usefulness
of the employee group as a basis for a selection
program is further limited. The best practical
solution to both problems—that of bias in
sampling and that of restriction of range in em-
ployee groups—seems to be the use of two
groups: first, a randomly selected employee
group as a trial group, for reasons of economy,
and second, an unselected applicant group as
a follow-up group, to discover whether or not
the selection program will select among
applicants as well as among employees. This
solution, the use of separate groups, has the further
advantage of permitting a pragmatic estimate
of the shrinkage in multiple correlation. This
is a real advantage. An example is Selover's
two samples of clerical workers (N = 193 and
85, respectively) which yielded multiple cor-
relations for 4 tests and criterion of .41 and .33,
respectively (9). An instructive example of
the danger involved in putting one's faith in a
single small sample, particularly when that
sample has been used to develop a scoring
procedure, is given by Kurtz (4). Here a
scoring technique for the Rorschach Test was
developed which classified correctly 79 out of
80 sales managers. This was so impressive to
many people concerned that they were
prepared to start using it as a selection device
immediately. A follow-up on a second sample
yielded a validity coefficient of .02!
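
For comparison with the pragmatic two-sample estimate, a formula-based estimate of shrinkage can be sketched. The adjustment below is the familiar adjusted-R² correction, often attributed to Wherry; it is an assumption brought in for illustration and is not the procedure described in the text. Only the figures R = .41, N = 193, and 4 tests are taken from the example above.

```python
import math

def shrunken_multiple_r(r_observed, n_cases, n_predictors):
    """Estimate the multiple R expected in a new sample (adjusted-R^2 formula).

    Assumption: R_adj^2 = 1 - (1 - R^2) * (N - 1) / (N - k - 1).
    """
    denom = n_cases - n_predictors - 1
    if denom <= 0:
        raise ValueError("need more cases than predictors plus one")
    r2_adj = 1 - (1 - r_observed ** 2) * (n_cases - 1) / denom
    return math.sqrt(max(r2_adj, 0.0))  # a negative estimate is reported as zero

estimate = shrunken_multiple_r(0.41, 193, 4)
print(f"formula estimate of shrunken R: {estimate:.2f}")
```

The formula suggests only a slight drop from .41, whereas the second sample actually gave .33. A follow-up group therefore tells more than the formula alone, which is the advantage claimed for the two-group design.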
The criterion is, of course, a question of the
utmost importance but we cannot discuss all the
ramifications of the problem here. Aside
from the question of the applicability
of the criterion as a real measure of job [...]
—the validity of the criterion—we find a prob-
lem in the reliability of the criterion. Only
95 reports, or 22% of the 427 acceptable re-
ports, make some attempt to include an estimate
of the reliability of the criterion, and yet it has
a profound influence upon the results of the
validation procedures. Of course, low relia-
bility will not give spuriously high validity—
rather the opposite—but many studies [seem]
to lead to the conclusion that certain tests are
worthless in a given situation, whereas [...]
criterion should be ascertained.

Another difficulty in connection with the
criterion is the operation of external influences
such as age, experience and length of time on
the job.² Unless cognizance is taken of these
variables the results are difficult to interpret, to
say the least, and few studies control [all] of
these factors. For example, it is [easy to see]
how age may be predictive of job success if age
is influencing the criterion either in its own
right or operating through length of service,
and yet most studies lump together not only
all age groups but employees with widely dif-
fering lengths of service. On the other hand,
if age or length of service is influencing the
criterion a significant relationship with test
scores may be masked. One often suspects a
further contaminating factor when ratings are
used if test scores are not kept strictly con-
fidential until after ratings are made. The
facts seem to indicate that more attention must
be paid to proper experimental design if the
results of selection studies are to be useful.

² An attempt to compensate for bias in the [direction]
of longer service may be found in a recent [paper by]
Rundquist and Bittner (8), and McMurry and Johnson
attempted to secure ratings on subjects [with]
approximately equal time on the job (6).


The Adequacy of Employee Selection Reports 223

How many reports, then, meet all criteria of
adequacy in both experimental design and
report? We find that 46 out of the 427
originally selected contain no second or follow-up
group but are acceptable in all other respects,
such as sufficiently large N, adequate
and complete statistical presentation throughout,
etc. We further find that 17 are adequate
in all respects except that no measure of the re-
liability of the criterion is presented. Finally,
if we count the total number of reports which
are satisfactory in all respects we discover
only eight, or .4% of the 2100 references with
which we started. These eight studies are as
follows:


1. Bellows, R. M. Studies of clerical workers. Chap.
VIII in Stead, W. H., Shartle, C. L., et al. Occu-
pational counseling techniques. New York: Ameri-
can Book Co., 1940, ix + 273, pp. 144-146.
(Study of coding clerks.)

2. Blum, M., and Candee, B. The selection of de-
partment store packers and wrappers with the aid
of certain psychological tests. J. appl. Psychol.,
1941, 25, 76-85.

3. Guilford, J. P., and Comrey, A. L. Prediction of
proficiency of administrative personnel from per-
sonal history data. Educ. psychol. Measmt., 1948,
8, 281-296.

4. Holliday, F. The relation between psychological
test scores and subsequent proficiency of appren-
tices in the engineering industry. Occup. Psy-
chol., Lond., 1943, 17, 168-185.

5. [First author illegible], Endler, O. L., and Kolbe,
L. E. [Title partly illegible] methods. Chap. VII in
Stead, W. H., Shartle, C. L., et al. Occupational
counseling techniques. New York: American Book
Co., 1940.

6. Rundquist, E. A., and Bittner, R. H. Using
ratings to validate personnel instruments: a study
in method. Personnel Psychol., 1948, 1, 163-183.

7. Sartain, A. Q. Relation between scores on certain
standard tests and supervisory success in an air-
craft factory. J. appl. Psychol., 1946, 30, 328-
332.

8. Selover, R. B. The development and validation
of a battery of tests for the selection of clerical
workers. Amer. Psychologist, 1948, 3, 291-292
(abstract), and personal communication.


It is not intended to imply that these studies 
found highly predictive test batteries, but 
merely that the technique was adequate. Con- 
clusive negative findings are important and are 
too frequently ignored or even suppressed. 

In conclusion, let me emphasize two points. 
First, the actual work done by industrial 
psychologists is not as bad as would appear 
from this analysis, and the trend is definitely 
toward more complete and careful design and 
execution. In many cases our criticisms apply 
to the reports, not necessarily to the studies 
themselves. More care should be taken in the 
preparation of reports so that all relevant in- 
formation is available to the reader. 


Requirements of a “Good Report”


Perhaps a summary of the items one weary 
abstractor would like to see made explicit 
would be in order: 

1. Detailed job description, with each group 
treated separately. 

2. Complete description of the sample: N
(sufficiently large), what proportion of the total 
population this represents and how selected, 
factors involved in hiring, age, length of time 
on the job (preferably with widely differing em- 
ployees treated as separate groups), and total 
experience in jobs of similar nature; use of two 
samples, one an applicant group. 

3. Exact test titles; when in the employment 
experience the tests were administered; whether
the tests were a factor in hiring; where the 
tests were given; under what conditions and 
incentives the tests were given; reliabilities of 
tests with comparable groups. 

4. Detailed description of the criterion; 
length of time on the job when the criterion 
measure was applied (with widely differing em- 
ployees treated as separate groups); reliability 
of the criterion; some discussion of the validity 
of the criterion selected; if ratings are used, 
some estimate of the amount of contact the 
rater has with the employee; if production
records are used, the duration of the period and
whether there were any unusual factors oper- 
ating at that time. 

5. Adequate statistical treatment, with as- 
surance that the assumptions governing the 
use of the given measures have been met, and 
actual report of the numerical results, together 
with an appropriate measure of significance. 


This may seem like a large order, but many 
adequately executed studies already reported 
could have included most of the items since it 
is obvious from certain remarks that the author 
must have taken them into consideration. In 
view of the untrustworthiness of many reports 
these items should be made explicit. 

A final point concerns those studies done by 
inadequately trained personnel. There are 
many of these and they are quite useless. 
They point to the ultimate desirability of some 
method of identification of properly qualified 
personnel for employee selection programs. 


Summary 


A survey of more than 2,100 references on 
employee selection in industry has revealed 
that only 427 contain sufficient information to 
permit evaluation of the study. These 427 
reports are analyzed in terms of annual volume, 
jobs most frequently investigated, statistics 
used in presentation of validity, number of 
subjects and general adequacy of design. 
This analysis reveals that many of these studies 
are inadequate to permit drawing conclusions 
as to the efficacy of the selection procedures
employed. Factors which influence results
but are difficult to evaluate from reports as 
they are usually published are discussed. 
Some recommendations for items to be included 
in reports of employee selection programs are
presented. 


Received October 17, 1949.


References 


1. Cronbach, L. J. Statistical methods applied to 
Rorschach scores. Psychol. Bull., 1949, 46, 393- 
429. 

2. Dorcus, R. M., and Jones, M. H. Handbook of 
employee selection. New York: McGraw-Hill, 
1950. 

3. Ghiselli, E. E. The validity of commonly employed 
occupational tests. Univ. of Calif. Publ. in 
Psychol., 1949, 5 (9), 253-288. 

4. Kurtz, A. K. A research test of the Rorschach
Test. Personnel Psychol., 1948, 1, 41-51.

5. MacMillan, M. H., and Rothe, H. F. Additional
distributions of test scores of industrial em-
ployees and applicants. J. appl. Psychol., 1948,
32, 270-274.

6. McMurry, R. N., and Johnson, D. L. Develop-
ment of instruments for selecting and placing
factory employees. Advanc. Mgmt., 1945, 10,
113-120.

7. Rothe, H. F. Distribution of test scores of indus- 
trial employees and applicants. J. appl. Psy- 
chol., 1947, 31, 480-483. 

8. Rundquist, E. A., and Bittner, R.H. Using ratings 
to validate personnel instruments: a study in 
method. Personnel Psychol., 1948, 1, 163-183. 

9. Selover, R. B. The development and validation 
of a battery of tests for the selection of clerical 
workers. Amer. Psychologist, 1948, 3, 291-292, 
and personal communication. 

10. Stromberg, E. L. Testing programs draw better 
applicants. Personnel Psychol., 1948, 1, 21-29. 




Cross Validation of an Abbreviated Point Job Evaluation System 


Milton K. Davis and Joseph Tiffin 


Occupational Research Center, Purdue University 


Job evaluation has become widely used as a 
means of establishing equitable wage rates. 
Although there are many different methods 
available for constructing job evaluation sys- 
tems, the most widely used technique involves 
the use of some type of a point scale. The ap- 
proach used in point systems is to break the 
job into the various component items and assign
points to each of these. The total points
for all these items represent the evaluated job. 
During recent years there has been an attempt 
to construct scales which use fewer items than 
the longer and more involved ones. 

Previous Studies. Lawshe and his associates 
(3, 4, 5, 6, 7) have published much material 
on such abbreviated scales. Chesler (1) has 
also published research on this subject. The 
procedure used in these studies was to select 
the three or four most important items from 
the longer system by the Wherry-Doolittle 
technique. Abbreviated scales thus derived 
were then compared with the original scales to 
determine the amount of agreement between 
the two. These studies have demonstrated 
that these abbreviated scales yield results 
which are comparable to the original system.

However, there has been some criticism 
against this approach since it was necessary 
first to analyze the original scale in order to 
derive the abbreviated one. Otis (8, p. 98) has 
stated this criticism in his recent book on job 
evaluation as follows: “How would one go 
about building an abbreviated scale? There 
is no way of knowing which one of the shorter 
scales obtained by Lawshe would best apply
to a given plant. Constructing a shorter scale
would necessitate a complete job evaluation
using a longer scale. Either factor analysis or
the Wherry-Doolittle technique would have to
be applied to the data, and a shorter scale so
derived would be used to keep the system up
to date. These savings would not be very
great.”

Purpose of this Study. The present study
was conducted to determine whether an
abbreviated job evaluation scale, constructed in
the light of key items previously identified by 
Lawshe (3, 4, 5), will achieve the same basic 
result as longer job evaluation systems. This 
abbreviated job evaluation system was de- 
rived without making a Wherry-Doolittle or 
factor analysis of the specific job evaluation 
data involved. Thus, the present study repre- 
sents a cross-validation of this abbreviated job 
evaluation scale against the hold-out group of 
other job evaluation installations. 


Procedure 


Derivation of the Abbreviated Scale. The ab- 
breviated point scale used in the present study 
was derived from the results of earlier research 
by Lawshe (4). That study analyzed the 
NEMA job evaluation system as it operated in 
three industrial plants. The NEMA job 
evaluation system, which consists of eleven 
items, was adapted by Kress (2) from the West- 
ern Electric Company’s procedure for use by 
the National Electric Manufacturer’s Association.
It is one of the most widely used point
job evaluation systems. 

From the three multiple regression equations 
obtained in Lawshe’s study, an average mul- 
tiple regression equation was determined for 
cross-validation. Only those items which ap- 
peared in at least two of the three regression 
equations were used. These items were ex- 
perience, unavoidable hazards, and initiative 
and ingenuity. Experience was the only item 
which was present in all three equations. The 
average value for each of these three items 
plus the average value for the constant was 
selected as the basis for further study. The 
final equation for predicting total points on the 
original system on the basis of the three items 
mentioned was: 

Total points = 41 + 1.5 (experience points)
+ 1.7 (initiative and ingenuity points) + 3.8
(hazards points).
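
As a minimal sketch, the equation can be written as a function, taking the constant as 41 and the three weights as 1.5, 1.7, and 3.8. The function name and the point values in the example are hypothetical and are not drawn from any of the installations studied.

```python
# Coefficients of the averaged regression equation, as read from the text.
CONSTANT = 41.0
WEIGHT_EXPERIENCE = 1.5
WEIGHT_INITIATIVE_INGENUITY = 1.7
WEIGHT_HAZARDS = 3.8

def abbreviated_total_points(experience, initiative_ingenuity, hazards):
    """Predict total points on the full scale from the three key items."""
    return (CONSTANT
            + WEIGHT_EXPERIENCE * experience
            + WEIGHT_INITIATIVE_INGENUITY * initiative_ingenuity
            + WEIGHT_HAZARDS * hazards)

# Hypothetical job rated 44, 28, and 10 points on the three items.
print(round(abbreviated_total_points(44, 28, 10), 1))
```

Because the weights are fixed in advance, applying the scale to a new plant requires only the three item ratings for each job.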

Eight different companies submitted point
job evaluation data to Purdue University.
Each of the sets of job evaluation data had
been obtained with a system that included the 
three items mentioned above among a con- 
siderably larger number of items. All of the 
jobs represented were for hourly-paid indus- 
trial workers. Three of these installations 
were NEMA plans and the remaining five were 
other types of point rating systems. The 
total points for the abbreviated scale were 
calculated for each job from the equation 
given above. 


Analysis and Results 


Three types of analysis of the data were 
made. The first was a correlational analysis. 
Table 1 presents the correlations for each 
company between the total points from the 
regression equation on the three items and the 
original total points. Examination of this 
table shows that for those companies using 
the NEMA system the correlations were all 
.94 or above. Since these results were obtained
from a multiple regression equation, the
correlations presented correspond to multiple 
correlations between the abbreviated scale and 
the total point scale. It should be kept in 
mind, however, that since the multiple regres- 
sion equation used came from previous studies 
of other plant installations, the correlations ob- 
tained by no means represent the maximum val- 
ues that would have been obtained if the present 
data had been used in determining the multiple 
regression equation. The standard errors of
the correlations are so small that any possibility
of these representing chance fluctuations
from zero is virtually eliminated. Therefore,
these correlations represent the extent of agree-
ment between a single abbreviated point scale
and the NEMA scale as it operated in three
separate companies.


Table 1


Correlations Between Three-Item Total and
Total Points


                       No. of
Company                Jobs       r      S.E.
No. 1 NEMA              341      .96     .054
No. 2 NEMA              126      .95     .089
No. 3 NEMA              605      .94     .041

Non-NEMA Systems

No. 4 (12 items)         61      .98     .129
No. 5 (11 items)         77      .69     .115
No. 6 (10 items)        185      .96     .074
No. 7 (11 items)        273      .96     .061
No. 8 (23 items)        253      .91     .063
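
The tabled standard errors appear to be those of a zero correlation, 1/sqrt(N - 1). That reading is an inference from the figures, not a statement in the text, and the sketch below simply checks it for several companies. The dictionary and function names are hypothetical.

```python
import math

# (number of jobs, tabled standard error) for a selection of the companies.
TABLED = {
    "No. 1": (341, 0.054),
    "No. 2": (126, 0.089),
    "No. 3": (605, 0.041),
    "No. 4": (61, 0.129),
    "No. 6": (185, 0.074),
    "No. 7": (273, 0.061),
    "No. 8": (253, 0.063),
}

def se_of_zero_correlation(n_jobs):
    """Standard error of r when the true correlation is assumed to be zero."""
    return 1.0 / math.sqrt(n_jobs - 1)

for company, (n_jobs, tabled_se) in TABLED.items():
    computed = se_of_zero_correlation(n_jobs)
    print(f"{company}: computed {computed:.3f}, tabled {tabled_se:.3f}")
```

On this reading even the smallest listed group gives a correlation several times its standard error, which is what the text means by chance fluctuation from zero being virtually eliminated.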

The second part of Table 1 shows the correla- 
tions between the total points from the regres- 
sion equation on the three items and the 
original total points for the non-NEMA sys- 
tems studied. In view of the method used to 
derive this abbreviated scale, its application to 
the latter companies might be open to question. 
However, these correlations were determined 
to find how well this one shortened scale oper- 
ates for point scales which do not strictly 
follow the NEMA classifications and weights. 

All the non-NEMA company plans yielded 
correlations above .90 with one exception. 
Company No. 5, a power and light utility firm, 
used a system which placed more weight on the 
unavoidable hazards item than is true with the 
NEMA scale. This fact probably accounts for 
the lower correlation. 

Thus, the correlational analysis would seem
to indicate that this abbreviated point scale
has validity not only for the NEMA system
but also for other point systems which are
similar to the NEMA system and which include
the items in the present abbreviated point scale.

Another method of analysis was in terms of
labor grade displacement. This analysis was
made in two separate ways, the first of which
determined labor grade displacement when
new total point values were determined from
the regression equation previously given
and the size, number, and limits of the labor
grades for the simplified scale were precisely
the same as those on the original scale. There-
fore, the labor grades of the company scale rep-
resent the criterion against which the abbrevi-
ated scale labor grades were compared. This
analysis could be made only for the first six
companies mentioned in Table 1 because labor
grades were not available for companies
7 and 8.

These results are summarized in Table 2.
If the same or next adjacent labor grade is
set as an acceptable standard, it can be seen
that for the three NEMA installations 97 per
cent, 96 per cent, and 94.5 per cent of the jobs
meet this criterion. In company No. 2 there
are 1.6 per cent of the jobs, representing only
two jobs, which would be displaced by three
labor grades.

These results compare favorably with those 
previously reported by Lawshe (4). In plant 
A of his study, 99.2 per cent of the jobs would 
meet the criterion of the same or next adjacent 
labor grade. 

For the three companies with the non-NEMA 
point rating scales, the labor grade displacement 
was much greater. Table 2 shows that the per- 
centages of jobs in the same or next adjacent 
labor grades for the three non-NEMA systems 
studied were, respectively, 42.6, 18.2, and zero.

Such low correspondence might seem to in-
validate the use of the abbreviated scale with
non-NEMA systems. However, a final analysis
was made which overcame much of this
difficulty. This analysis involved a determina-
tion of the range of points from the lowest to
the highest evaluated job on the abbreviated
scale. This range was then divided into the
same number and with the same relative width
of labor grades that existed for the same jobs 
on the original scale. Also, the lowest evalu- 
ated job on the abbreviated scale represented 
the lower limit of the first labor grade for the 
abbreviated scale. This method gave a con- 
sistent and reproducible approach for setting 
these labor grades. 
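
The second method can be sketched as follows. The sketch assumes that grade boundaries on the original scale are known, that the abbreviated range runs from the lowest to the highest evaluated job, and that the top boundary belongs to the highest grade. All function names and the small data set are hypothetical.

```python
from collections import Counter

def rescale_grade_limits(original_limits, abbreviated_points):
    """Map the original grade boundaries onto the abbreviated point range.

    original_limits: ascending boundaries [b0, b1, ..., bk] of the k labor
    grades on the original scale. The returned boundaries start at the lowest
    abbreviated score, end at the highest, and keep each grade's relative width.
    """
    low, high = min(abbreviated_points), max(abbreviated_points)
    span = original_limits[-1] - original_limits[0]
    return [low + (b - original_limits[0]) / span * (high - low)
            for b in original_limits]

def labor_grade(points, limits):
    """Return the 1-based grade whose interval contains `points`."""
    for grade in range(1, len(limits)):
        if points < limits[grade]:
            return grade
    return len(limits) - 1  # the top boundary belongs to the highest grade

def displacement_percentages(original_grades, abbreviated_grades):
    """Per cent of jobs displaced by 0, 1, 2, or 3-or-more labor grades."""
    counts = Counter(min(abs(a - b), 3)
                     for a, b in zip(original_grades, abbreviated_grades))
    n = len(original_grades)
    return {d: 100.0 * counts.get(d, 0) / n for d in range(4)}

# Hypothetical data: three grades of relative width 1 : 1 : 2.
original_limits = [100, 150, 200, 300]
abbreviated_points = [120, 160, 200, 280]
original_grades = [1, 2, 2, 3]

limits = rescale_grade_limits(original_limits, abbreviated_points)
new_grades = [labor_grade(p, limits) for p in abbreviated_points]
print(limits)
print(new_grades)
print(displacement_percentages(original_grades, new_grades))
```

Tying the boundaries to the abbreviated range removes the dependence on the absolute point values of the original system, which is why the method can help where the original plan is not a NEMA plan.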

Table 3 shows the results of this analysis. 
The percentage of jobs falling in the same labor 
grade, displaced one, two, and three or more 
labor grades is shown. 


Table 2


Labor Grade Displacement: Based on the Abbreviated
Point Scale with Number and Limits of
Labor Grades the Same as in
the Original System


                      Per Cent of Jobs

                                               Displaced
                       Displaced   Displaced   Three or
              Same     One         Two         More
              Labor    Labor       Labor       Labor
Company       Grade    Grade       Grades      Grades
No. 1 NEMA    45.0     52.0         3.0
No. 2 NEMA    52.5     43.5         2.4         1.6
No. 3 NEMA    44.6     49.9         5.5
No. 4          9.8     32.8        37.8        19.6
No. 5          6.5     11.7        35.0        46.8
No. 6                                         100.0


Table 3


Labor Grade Displacement: Based on Proper Number
of Labor Grades within the Abbre-
viated Point Range


                      Per Cent of Jobs

                       Displaced   Displaced   Displaced
              Same     One         Two         Three
              Labor    Labor       Labor       Labor
Company       Grade    Grade       Grades      Grades
No. 1 NEMA    68.6     29.9         1.5
No. 2 NEMA    56.3     39.7         2.4         1.6
No. 3 NEMA    44.8     49.4         5.8
No. 4         42.8     47.6         9.6
No. 5         31.2     49.3        19.5
No. 6         36.8     49.2        13.5         0.5


Table 3 demonstrates that for the three 
NEMA system companies 98.5 per cent, 96
per cent, and 94.2 per cent of the jobs fall into
the same or next adjacent labor grade when 
the abbreviated point range method is used. 
However, these results do not differ materially 
from those obtained by the method which 
used for the abbreviated system labor grade 
classification the same size, number, and limits 
of labor grades as those used in the original 
system. 

For the three non-NEMA systems there is 
definite improvement in accuracy of job place- 
ment by labor grade when the second method 
for placement is used. The respective per- 
centages were 90.4, 80.5, and 86 for displace- 
ment of less than two labor grades. In addi- 
tion, only one job in company No. 6 would be 
displaced by three labor grades. 

The results thus show that the agreement 
between the labor grade placement of jobs by 
the long and the short scale is essentially the 
same when the original size and limits of labor 
grades are used for the short scale as when a 
proper division of the abbreviated point scale 
in the labor grades is used with this scale. For
the non-NEMA systems, the proper allocation 
of labor grades within the abbreviated point 
range appears to result in greater agreement 
between labor grade placement by the long 
and short scales. 


Summary 


An abbreviated point job evaluation scale 
was constructed on the basis of prior research
on the NEMA job evaluation system. This
abbreviated scale was based upon the average 
of three multiple regression equations found 
in that earlier study. 

Eight job evaluation installations were 
studied for cross-validation of this one abbre- 
viated scale. These installations included 
three NEMA systems and five non-NEMA 
point systems. All the jobs represented by 
these plans were hourly-paid. 

Zero-order correlations between the total 
points of the abbreviated scale and the longer 
original scale were calculated for each of the 
eight installations. The amount of labor 
grade displacement was found by two methods. 
The first method compared the labor grade 
displacement when the labor grades for the ab- 
breviated system were set up using the same 
limits as the original installation. In the 
second method for comparison of labor grade 
displacement, the abbreviated scale labor 
grades were set up by dividing the abbreviated 
scale range into the same number of labor 
grades as the original scale used for those jobs. 

On the basis of this study, the following 
conclusions are supported: 


1. For the NEMA installations, the correla- 
tions between abbreviated scale total points 
and the original scale total points were .96, 
.95, and .94.

2. This abbreviated scale will operate nearly 
as effectively for non-NEMA point job evalua- 
tion systems. With one exception, the ob- 
tained correlations of total points between the 
simplified scale and the non-NEMA point sys- 
tems were above .90. A main requirement in 
applying this abbreviated scale to non-NEMA 
systems is that these other systems have items
which closely approximate the three chosen
items.

3. Labor grade displacement results show 
that 97 per cent, 96 per cent, and 94.5 per cent 
of the jobs in the NEMA installations would 
remain in the same or next adjacent labor 
grades when predicted from the abbreviated 
scale. These results do not change materially
when labor grades are set up using the abbre-
viated point range. 

4. For the non-NEMA point systems stud- 
ied, the superior approach was to divide the 
abbreviated scale point range into labor grades, 
the number and relative width of which are 
determined by the number and relative width 
of labor grades in the original scale. This 
method yielded 90.4 per cent, 80.5 per cent, and 
86 per cent in the same or next adjacent labor 
grade.

5. From the results of the present study, it 
would seem that an abbreviated scale made 
up of three items—experience, hazards, and 
initiative and ingenuity—usually will achieve 
essentially the same results as a more extensive 
point system which includes these items. 


Received October 1, 1949.


References 


1. Chesler, D. J. Abbreviated job evaluation systems
derived on the basis of “internal” and “external”
criteria. J. appl. Psychol., 1949, 33, [pages illegible].

2. Kress, A. L. How to rate jobs and men. [Journal
title partly illegible] Mgmt., 1939, 10, 60-65.

3. Lawshe, C. H., Jr., and Satter, G. A. Studies in job
evaluation. 1. Factor analysis of point ratings
for hourly-paid jobs in three industrial plants.
J. appl. Psychol., 1944, 28, 189-198.

4. Lawshe, C. H., Jr. Studies in job evaluation. 2.
The adequacy of abbreviated point ratings for
hourly-paid jobs in three industrial plants. J.
appl. Psychol., 1945, 29, 177-184.

5. Lawshe, C. H., Jr., and Maleski, A. A. Studies in
job evaluation. 3. An analysis of point ratings
for salary paid jobs in an industrial plant.
J. appl. Psychol., 1946, 30, 117-128.

6. Lawshe, C. H., Jr., and Alessi, S. L. Studies in job
evaluation. 4. Analysis of another point rating
scale for hourly-paid jobs and the adequacy of
an abbreviated point scale. J. appl. Psychol.,
1946, 30, 310-319.

7. Lawshe, C. H., Jr., and Wilson, R. F. Studies in
job evaluation. 5. An analysis of the factor
comparison system as it functions in a paper
mill. J. appl. Psychol., 1946, 30, 426-433.

8. Otis, J. L., and Leukart, R. H. Job evaluation.
New York: Prentice-Hall, 1948.

9. Peters, C. C., and Van Voorhis, W. R. Statistical
procedures and their mathematical bases. New
York: McGraw-Hill Book Co., 1940.




Age and Route Sales Efficiency * 


C. B. Cover and S. L. Pressey 
Ohio State University 


In the past fifty years, the average length 
of life in this country has increased 18 years, 
from about 49 to 67; and the proportion of 
people 45 and over has grown from 18% to 29% 
(4, pp. 50 and 256). Also, “there has also been 
a tendency over a long period of time to lower 
the age at which workers retire from active 
employment.” And pressure for pension plans 
has accentuated a “wide-scale prejudice against
hiring of the worker over 45” (2, pp. 63 and 20).
The plight of the older workers thus mounts 
at the same time that their number increases. 

Fortunately, the situation is receiving in- 
creasing attention, as evidenced by mentions in 
newspapers and magazines, and the efforts of 
such groups as the Desmond committee of 
the New York legislature (2, 3). Considera- 
tion has naturally tended to center on large- 
scale industry. But other common types of 
work may present certain of these problems in 
acute form. An occupation which initially 
gave such good financial returns that it tended 
to hold men into middle life past the age of 
ready occupational shift, gave little by way of 
experience which might help in any shift, but 
then dropped them or became so burdensome 
that they dropped out, would seem to pre- 
sent such problems. If thereby the business 
lost older workers whose experience might 
in some way have been utilized, it also suffered. 
This paper attempts briefly to outline such a 
situation. 


Cases and Materials 


The present study deals with 92 men whose 
work consisted of house-to-house selling of 
foodstuffs from a truck, making collections, and 
canvassing for new customers. Each salesman 
operated two routes on alternate days and thus 
served each customer three times a week. 
Pay, on a commission basis, was excellent: a
hundred dollars or more a week. A successful
salesman needed to have a pleasing personality,
skill in selling, and enterprise in soliciting new
customers. Presumably, all this involved
knowledge of his goods, of demand in different
neighborhoods, and of the likes of each cus-
tomer. Since many of the sales were on credit,
the man needed to be shrewd in his appraisal
of customers; he needed also to estimate closely
goods needed on each run so that there would
be enough, but little surplus which might spoil
or be salable only at a discount. He had also
to be careful and efficient in his handling of his
truck. Finally, the route salesman needed the
physical stamina required for repeatedly get-
ting in and out of his truck and going up and
down steps carrying a basket loaded with
merchandise, eight or nine hours a day, six
days a week.

* [...] Cover while a [...] now on the staff
of Ohio University but in September, 1950, will become
Assistant Dean of Students at Muskingum College.

The findings are based on rank order ratings 
by sales supervisors of these men in: (1) total 
sales; (2) credit; (3) surplus; (4) truck repair 
and accidents; and (5) time. Ratings were 
based largely upon records and were thus fairly 
objective. 

Results 

The important features of the findings are 
exhibited in the following tables. In Table 1, 
the number at each age, as shown by the row 
of figures next to the bottom, should first be 
noted. Most men were in their twenties and 
thirties, with few staying beyond these years 
and only one (included with those in their 
fifties) past 60. Moreover, the first part of 
Table 1 shows that the older men who did 
continue fell off markedly in the basic issue of 
amount of sales. The twenty best salesmen 
are all under 40. Median rank drops markedly
from the thirties to the fifties. A similar fall- 
ing off was found in efficiency in use of time. 
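
The comparison of median ranks across age groups can be illustrated with a short sketch. It is only an illustration: the individual ranks below are invented (the paper reports grouped frequencies and medians, not individual records), and the function name `median_rank_by_group` is hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_rank_by_group(records):
    """Return {group: median rank} for (group, rank) pairs.

    A lower rank means a better rating (rank 1 = best salesman).
    """
    by_group = defaultdict(list)
    for group, rank in records:
        by_group[group].append(rank)
    return {group: median(ranks) for group, ranks in by_group.items()}


# Invented illustrative data, NOT the study's records.
records = [
    ("20-9", 40), ("20-9", 52), ("20-9", 70),
    ("30-9", 5), ("30-9", 36), ("30-9", 60),
    ("50-9", 75), ("50-9", 81), ("50-9", 90),
]
medians = median_rank_by_group(records)
```

A higher median rank for a group means that its members tend to stand lower in the rank order.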

In contrast, the second part of Table 1 shows 
increasing efficiency with age in truck opera- 
tion, as shown by repair costs and accidents, 
up through the forties; the twenty-two drivers 
who were poorest in these respects were all 
under 40. There was a similar improvement 


230 C. B. Cover and S. L. Pressey 


with age in handling of credit, while judgment 
regarding surplus remained about the same. 
In short, though judgment appears to improve
with age, a falling off in energy appears to 
cause a decrease in effectiveness in route 
salesmen. 

The question naturally arises as to whether, 
if years of service instead of simply age were 
considered, ratings would be different. Table 
2 so groups the cases, and in terms of ratings 
for total efficiency. Here the poorer ratings of 
the older salesmen are even more clearly ex- 
hibited! The two who have been salesmen 
longest are among the poorest. 

The total task involved in these route sales 
jobs thus seems to be such that men do not 
maintain their efficiency past early middle life 
and most drop out. What happens to them? 
The truck sales positions pay very well and 
they have high status. Few of the men are 
willing to drop back to a much more poorly 
paying sales position in one of the company’s 
stores. And as the business is organized, there 
are few supervisory or production jobs into 
which these men can go. As a result, though 
intimately familiar with the product sold and 
with sales problems, and potentially of con- 
tinuing use to the firm, they commonly leave 
the business in their 40’s. Some of them
thereafter may do well. But most of them,
so far as known, find no other satisfactory
position at that age. And they become dis-
satisfied subsistence farmers, insurance sales-
men, handymen, or otherwise on the margin
of things vocationally.


Table 1

Rank Order in Total Sales and Truck Operation and
Accidents of 92 Route Salesmen, as
Related to Age *

                  Age Groups: Total Sales     Age and Truck Operation
Rank Order        20-9  30-9  40-9  50-9      20-9  30-9  40-9  50-9

1-10 to 91-100    [cell frequencies not legible in this copy]

Total              29    42    13     8        29    42    13     8
Median             52    36    46    81        64    41    18    36

* Median rank for each age group is shown by N in
bold-face type.


Table 2

Total Efficiency of 92 Salesmen in Relation to
Years of Service *

                      Years of Service
Rank Order      0-9    10-19    20-29    30-39
1-10             8          2
11-20            9          1
21-30            9          1
31-40            9          1
41-50            8       1        1
51-60            8       1        1
61-70            8          2
71-80            9          1
81-90            5       2        1        2
91-100           2
Total           75       8        7        2
Median          43      36       68       86

* Median rank order for each length of service group
is shown by N in bold-face type.
[Where a row has a single entry beyond the 0-9 column,
this copy does not show whether it belongs under 10-19
or 20-29; such entries are set between the two columns.]


Discussion 


The data are indeed inadequate. Having
been obtained after the war, they are un-
doubtedly influenced in various ways by cir-
cumstances related thereto. About the later
careers of all too large a portion of the men
who had left, nothing was known. But known
outcomes were so predominantly unsatisfactory
that they seemed to point up certain problems
of management and employee welfare. Might
truck sales work be in some way made less
fatiguing for older men, so that they might
continue longer? Might the concern find other
positions in its own organization into which
these men might shift? Could the men's atti-
tudes regarding the status of indoor selling jobs
be so modified that they would be more willing
to take them? Or could employment offices
find more positions into which men with such
experience could go, when they are in the
forties or fifties? Situations of the type de-
scribed in this brief paper seem not economi-
cally healthy, and in need of continuing study.




Summary 


With the increasing length of life and number 
of older workers in this country, it becomes in- 
creasingly important to investigate the rela- 
tions of efficiency in different types of work to 
age, and to consider means by which indi- 
viduals who are in occupations not feasible for 
the older years may find other vocational 
opportunities. 


1. A study of 92 men selling foodstuffs from 
retail trucks showed increased efficiency with 
age in handling their trucks and in judgment in 
business relations with their customers (as in 
handling of credit), but a falling off in sales. 
The drain on physical energies involved in the 
work appeared to be the chief factor.

2. These men tended to leave such positions 
after 40; but most did not find other oppor- 
tunities in the firm nor (so far as known) 
locate satisfying employment elsewhere. 


3. The question is raised as to whether or 
not such work might be adapted so that older 
men could continue longer or other positions 
in the same firm found for them, which would 
utilize their experience. If not, might em- 
ployment offices find opportunities for such 
men which, to some degree, would take account 
of their vocational background, and be more 
suited to their age and needs? 


Received May 22, 1950. 
Early publication. 


References 


1. Clague, E. After 45: How about a job? Survey
Graphic, 1950, 86, 173-176.

2. Desmond, T. C., and others. Birthdays don't count. 
New York State Joint Legislative Committee on 
Problems of the Aging, Newburgh, N. Y., 1948. 

3. Desmond, T. C., and others. Never too old. New 
York State Joint Legislative Committee on 
Problems of the Aging, Newburgh, N. Y., 1949. 

4. Dublin, L. I., and others. Length of life (Rev. Ed.). 
New York: Ronald Press, 1949. 


Ortho-Rater Norms and Sex Differences 
J. H. Ely, N. C. Kephart, and Joseph Tiffin 


Occupational Research Center, Purdue University 


The growing use of the Ortho-Rater as an
industrial vision testing instrument has sug-
gested the need for male and female industrial
norms on the vision tests included in this in-
strument. It is the purpose of the present
article to present such norms, based on quite


Table 1

Percentile Norms on Ortho-Rater Distance Acuity Tests

                    Male                            Female
         Both   Right   Left   Worse     Both   Right   Left   Worse
Score    Eyes   Eye     Eye    Eye       Eyes   Eye     Eye    Eye
0          1      1       1      3         1      1       1      2
1          1      2       ?      4         1      2       2      3
2          1      3       3      5         1      2       3      5
3          1      3       4      6         1      4       4      7
4          1      4       5      8         2      5       6     10
5          2      6       6     10         3      8       9     13
6          3      8       9     14         6     12      14     20
7          8     12      14     20        16     20      22     29
8         14     22      22     31        27     37      38     48
9         25     34      33     45        44     53      54     66
10        37     43      44     56        62     67      60     79
11        59     61      66     76        82     82      86     92
12        87     92      91     98        95     97      97     99
13        96     99      97     99        98     99      99    100
14        98     99      99    100        99    100     100    100
15       100    100     100    100       100    100     100    100
N      7,659  7,657   7,648  7,646     2,469  2,460   2,46?  2,468

[? marks an entry or digit not legible in this copy. At score 1,
one of the two male single-eye entries is missing; it is shown
here under Left Eye.]
Table 2

Percentile Norms on Ortho-Rater Near Acuity Tests

                    Male                            Female
         Both   Right   Left   Worse     Both   Right   Left   Worse
Score    Eyes   Eye     Eye    Eye       Eyes   Eye     Eye    Eye
0          1      2       3      4         1      1       1      ?
1          1      3       3      6         1      2       2      ?
2          1      4       4      7         1      2       3      5
3          2      6       6      9         1      3       4      ?
4          4      7       8     12         1      1       5      8
5          6      9      10     14         2      5       6      ?
6          9     15      15     22         ?      9      11     15
7         16     20      23     29        10     15      17     25
8         26     20      31     39        23     28      28     38
9         46     49      46     60        48     59      53      ?
10        65     66      59     73         ?     79      71     85
11        87     80      77     87        93     92      89     92
12        98     95      95     98        99     99      98      ?
13        99     99      99    100         ?      ?       ?      ?
14         ?      ?       ?      ?         ?      ?       ?      ?
15       100    100     100    100       100    100     100    100
N      7,655      ?       ?      ?     2,468  2,469   2,468  2,467

[? marks an entry not legible in this copy. Legible entries are
given as they appear, including a few that look doubtful.]




Table 3

Percentile Norms on Ortho-Rater Vertical Phoria Tests

                            Male              Female
                 Score    Far    Near      Far    Near
Left
Hyperphoria        1       ?       3         1      3
                   2       2       ?         2      7
                   3       6      21         6     20
                   4      23      55        24     56
                   5      62      85        62     86
                   6      87      95        86     95
                   7      96      97        94     96
                   8      98      98        96     98
Right
Hyperphoria        9     100     100       100    100
N                          ?   7,356     2,436  2,439

[? marks an entry not legible in this copy.]


Table 5

Percentile Norms on Ortho-Rater Stereopsis
and Color Tests

      Stereopsis Test                 Color Test
Score    Male    Female        Score    Male    Female
0         10       18          0          1        1
1         15       26          1          1        1
2         22       35          2          5        6
3         31       47          3          9       14
4         40       58          4         25       39
5         48       67          5         52       72
6         62       79          6        100      100
7         71       85
8         79       89          N          ?    2,457
9        100      100
N          ?    2,464

[? marks an entry not legible in this copy.]


large samples of industrial employees. A 
second purpose is to compare the mean scores
and variability of men and women selected 
randomly from plants currently using the 
Bausch and Lomb Industrial Vision Service. 


Norms 
Tables 1 to 5, inclusive, give percentile norms 
on the several Ortho-Rater tests. These 


Table 4

Percentile Norms on Ortho-Rater Lateral Phoria Tests

                          Male              Female
               Score    Far    Near      Far    Near
Esophoria        1        3       3        4       3
                 2        6       6        7       6
                 3        9      10       10      12
                 4       15      18       15      2?
                 5       24      30       23      36
                 6       37      45       34      53
                 7       54      60       49      67
                 8       73       ?       68      77
                 9       8?      7?       81      83
                10       94      85       92      8?
                11       96      89       96      90
                12       98      92       9?      93
                13       99      94       98      95
                14      100      96       99      97
Exophoria       15      100     100      100     100

[? marks an entry or digit not legible in this copy.]


norms are based on approximately 7,600 men 
and 2,500 women. At the bottom of each 
column in the tables is given the exact number 
of cases on which the individual test norm is 
based. The values in the tables are percentiles.
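
How percentile norms of this kind can be derived from score frequencies is sketched below. The article does not state its exact definition, so the sketch assumes that a percentile is the percentage of cases scoring at or below a given score (consistent with the tables reaching 100 at the top score). The counts are invented and the function name `percentile_norms` is hypothetical.

```python
def percentile_norms(counts):
    """Map each score to the percent of cases scoring at or below it.

    `counts` is {score: number of cases with that score}.
    Assumption: percentile = cumulative percent at or below the score.
    """
    total = sum(counts.values())
    norms = {}
    running = 0
    for score in sorted(counts):
        running += counts[score]
        norms[score] = round(100 * running / total)
    return norms


# Invented frequencies for a four-point test, NOT the Ortho-Rater data.
counts = {0: 2, 1: 3, 2: 5, 3: 10}
norms = percentile_norms(counts)
```

Under this definition the highest score always receives a percentile of 100, which is the pattern shown in the last score row of each table.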

Table 6 summarizes the means, S.D.'s, and 
differences between males and females for the 
several tests in the Ortho-Rater. Quite a num-
ber of the differences, both in means and S.D.'s,
are significant at the 5% level or below.
Since the number of cases was large, several of 
the differences which are actually very small, 
were found to be significant at the 5% or even 
1% level. Such differences are probably of 
more theoretical than practical importance. 
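
Why very small differences reach significance with samples of this size can be seen from a sketch of the critical ratio (C.R.). The article does not give its formula; the code assumes the usual large-sample form, the difference between means divided by the standard error of that difference. The inputs are the Far Acuity, Both Eyes figures from Table 6 and the corresponding N's from Table 1, with the further assumption that both sexes have an S.D. of 2.07 (the tabled S.D. difference is .00). The function name `critical_ratio` is hypothetical.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means over its standard error."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Far Acuity, Both Eyes: male mean 10.69 (N = 7,659), female mean 9.64
# (N = 2,469); an S.D. of 2.07 is assumed for both groups.
cr = critical_ratio(10.69, 2.07, 7659, 9.64, 2.07, 2469)
```

Under these assumptions the result lands close to the tabled value of 21.88. Because the standard error shrinks with the square root of the sample size, a difference of only a few hundredths of a scale point can still yield a C.R. above the 5% or 1% criterion when several thousand cases are involved.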

It will be noted that in the color vision test 
a difference in favor of the men was found. 
This difference was significant at the 1% level. 
The authors are aware that this finding is 
contrary to long accepted theories and facts 
about the distribution of color blindness among 
the sexes. The explanation for this difference 
in findings is not known. 

It should be kept in mind that while the data 
of this study were based on randomly selected 
men and women employed on industrial jobs, 
they may not be typical of randomly selected 
men and women from the general population. 
Traditional industrial practices of placing men 
and women on different types of jobs may have
resulted in certain secondary selection factors
which would operate to make any randomly
selected group of industrial employees (particu-
larly women employees) different in certain
respects from the general population.


Table 6

Comparison of Male and Female Means and Standard Deviation on Ortho-Rater Vision Tests
Note: Numbers of cases in various categories shown at bottom of Tables 1 to 5.

                              Means                        Standard Deviations
                       Male   Female   Diff.    C.R.       S.D.†   Diff.    C.R.
Far Acuity, Both      10.69    9.64    1.05    21.88*      2.07     .00      .00
Far Acuity, Right     10.11    9.12     .99    16.80*      2.63     .11     2.62*
Far Acuity, Left      10.06    8.97     .09?    1.50?      2.70      ?      3.10*
Far Acuity, Worse      9.26    8.27     .99    15.47*      2.94     .25     5.56*
Near Acuity, Both      9.43    9.47    -.04     1.00       2.12     .52    18.57*
Near Acuity, Right     9.17    9.03      ?      2.64*      2.37     .65      ?
Near Acuity, Left      9.22    9.42     .10?    1.72**     2.94     .59    14.39*
Near Acuity, Worse     8.41    8.41     .00      .00       3.09     .63    14.65*
Far lateral phoria     7.07    7.25    -.18     2.90*      2.56    -.5?     3.41*
Far vertical phoria    5.26    5.28    -.02      .63       1.28      ?      4.35*
Near lateral phoria    7.22    6.80     .42     5.60*      3.23     .09     1.70**
Near vertical phoria   4.38    4.40    -.02      .87       1.37    -.03     1.30
Color                  5.08    4.68     .40    14.81*      1.14      ?      1.05
Stereopsis             5.23    3.97    1.26    18.53*      2.98     .06     1.25

* Significant at 1% level.
** Significant at 5% level.
† The table is headed with separate male and female S.D. columns, but
only one S.D. value per row survives in this copy, and which of the two
it is cannot be determined.
[? marks an entry or digit that is illegible or doubtful in this copy.]


Received October 1, 1949.


Received October 1, 1949. 


Fluorescent Light Versus Daylight 


J. Stanley Gray and Paul Prevetta 


University of Georgia 


Fluorescent lighting is comparatively new 
and few studies have been made of its effects 
on human vision. Luckiesh and Moss (1) 
compared fluorescent lights with tungsten 
lights of 20 foot-candles intensity by having 10 
subjects read for 30-minute periods on 9 dif- 
ferent days. The number of involuntary eye 
blinks during the first and last five minutes of 
each reading period was counted on the as- 
sumption that the frequency of blinks is related 
to the difficulty of seeing and the state of 
visual fatigue. They found that the increase 
in blinks was approximately the same for fluo- 
rescent lights as for tungsten lights. The small 
differences in increased blinking were not sta- 
tistically significant. 

However, Tinker (4) has criticized the rate 
of blinking as being a non-valid index of the 
visual function. He had 60 university stu- 
dents read under 10 foot-candles of well- 
diffused light and found that “the frequency of 
blinking is an unsatisfactory criterion of the 
readability of print." McNally (3) studied 
the relation of blink rate to reading various 
sizes of type and found no correlation. The
rate of blinking was unrelated to the size of 
type. McFarland, Holway, and Hurvich (2) 
found an inconsistent relationship between the 
rate of blinking and both the duration of read- 
ing and the intensity of illumination. They 
concluded that blinking is not a reliable index
of visual fatigue. 

The authors of the present study used the 
American Optical Company's Sight-screener to 
measure the effects of two hours of continuous 
reading under daylight as compared with two 
hours of continuous reading under fluorescent 
lights. The visual functions measured were 
acuity, stereopsis, and both lateral and vertical 
phoria at 14 inches and at 20 feet distances.
Fifty subjects (high school, college, and older
adults) read books set in 8 point type for two
hours under daylight of 20 foot-candles inten-
sity and for two hours under fluorescent lights
of 20 foot-candles intensity. For both reading 


sessions, the light source was behind and above; 
books were kept on an easel at right angles to 
the line of vision; intensity was maintained at 
20 foot-candles (venetian blinds were used to 
regulate the daylight); glare was eliminated by 
painting all peripheral background pastel 
green; the same book was read under both con- 
ditions; reading booths reduced outside dis- 
tractions; and the visual skills were measured 
(using the Sight-screener) at the beginning, at 
the end of one hour of reading, and at the end 
of two hours of reading for both sessions. 
Some of the subjects read first under fluorescent 
lighting and then under daylight, and others 
read first under daylight and then under 
fluorescent lights. 

There was both loss and gain in all four 
visual skills at both near and far measure- 
ments. Some subjects actually increased in 
certain visual skills after the two-hour reading 
period although most subjects lost in all skills 
except stereopsis. The summarized results are 
shown in Table 1. The differences between
daylight and fluorescent light were very small
and none was statistically significant.


Table 1

Visual Effects of Two Hours of Reading under Daylight
and Fluorescent Light for Fifty Subjects

                          Fluorescent        Daylight
                         (Mean Change)     (Mean Change)
Near Distance (14 inches)
   Acuity*                   -.04              -.42?
   Stereopsis*                .08               .00
   Lateral Phoria†           -.31              -.02
   Vertical Phoria†          -.01               .05
Far Distance (20 feet)
   Acuity*                   -.04               .02
   Stereopsis*               -.20                ?
   Lateral Phoria†           -.26                ?
   Vertical Phoria†          -.02                ?

* Data in “units of change.”
† Data in diopters.
[? marks an entry that is illegible or doubtful in this copy. The
left-to-right order of the two conditions follows the extracted copy
and is not certain.]


Summary

The conclusion seems to be justified that, as
measured by the Sight-screener, fluorescent
light of 20 foot-candles intensity is not inferior
to daylight of the same intensity for reading
8 point type material for two hours duration.

Received May 22, 1950.
Early publication.


References

1. Luckiesh, M., and Moss, F. K. Vision and seeing
under light from fluorescent lamps. Illum.
Engng., N. Y., 1942, 37, 81-88.

2. McFarland, R. A., et al. Studies in visual fatigue.
Cambridge, Mass.: Harvard School of Business
Administration, 1942.

3. McNally, [...]. Teach. Coll. Contr. Educ., [...].

4. Tinker, M. A. Validity of frequency of blinking as
a criterion of readability. J. exper. Psychol.,
1946, 36, 453-460.


Inconsistency in the Predictive Value of a Battery of Tests 


Robert M. W. Travers 
Division of Teacher Education, Board of Higher Education, New York City 


and 
Wimburn L. Wallace 


University of Massachusetts


The increased number of applicants for ad- 
mission to professional schools in recent years 
has made administrators keenly aware of prob- 
lems of selection. A growing dissatisfaction 
has been felt with the traditional criteria for 
selection, such as undergraduate grades or 
letters of recommendation, and some attention 
has been devoted to the possibility of using 
tests for selective purposes. Various large- 
scale testing programs have been initiated for 
this purpose in many professional areas, while 
in others, smaller programs have been devel- 
oped for studying the problem of selection. 
The schools of dentistry have shown admirable 
and exemplary caution in this matter, and 
while the Association of Schools of Dentistry 
has sponsored an experimental testing program, 
it has not introduced the tests as a requirement 
for admission, prior to adequate investigation. 
The use of tests for the selection of students 
of dentistry is still largely experimental. 

During 1947 and 1948, the University of
Michigan administered tests to applicants
for admission to its School of Dentistry. The
primary purpose of this battery was to explore
possible improvements in the admission sys-
tem. The tests were selected on the basis of
previous studies which now have been ade-
quately reviewed in a bulletin published by the
Veterans Administration¹ and which need not
be reviewed again here. The one exception in
this procedure was the inclusion of a test of
"Effectiveness of Expression" which, it was
believed, might measure a competency im-
portant to the dentist though possibly of little
importance in achieving satisfactory grades in
dental school.

The following tests were included in the bat-
tery: ACE Psychological Examination, 1945
Edition; MacQuarrie Test for Mechanical
Ability; Bennett Test of Mechanical Compre-
hension, Form BB; Revised Minnesota Paper
Form Board Test, Series MB; Interpretation
of Reading Materials in the Natural Sciences,
College Level, Tests of General Educational
Development; and Effectiveness of Expression,
Cooperative English Test B2, Form T.

¹ Predicting success in training for dentistry. Veterans
Administration Technical Bulletin TB 7-44, July 8,
1947, pp. 16.

The tests were administered at the Uni- 
versity of Michigan and at centers throughout 
Michigan. A few centers were also established 
outside of the State. In the Spring, 1947 
testing program there were 32 centers, but 
when the program was given again in the 
spring of 1948 these centers were reduced to 18. 
At each center the appointed examiner was 
given a set of directions to be read verbatim
to the examinees. 

The scores on the tests were correlated with 
the subsequent achievement of those admitted. 
These correlation coefficients are given in 
Table 1. 

This table shows an apparent inconsistency:
the tests had some predictive value when they 
were given to the group admitted in 1948, but 
practically no predictive value when they were 
given to the previous class. A number of 
hypotheses suggest themselves concerning this 
inconsistency, the most obvious one being that 
the tests were improperly administered during 
1947 but were properly administered in 1948. 
This hypothesis is easily tested since a group of 
tests, similar in many respects to the tests 
discussed, was administered to all students 
after admission through a testing program 
sponsored by the American Dental Associa- 
tion. If the tests given in 1947 were im- 
properly administered one could expect that 
they would have low validity, but that the 
battery administered by the American Dental 


238 Robert M. W. Travers and Wimburn L. Wallace 


Table 1 


Coefficients of Correlation Between Honor Point Ratio and Test Scores 


1947 Admissions 


(N = 82) is 
ri ear Ist Year 
Honor Point Honor Paint Honor Point 
Tests Ratio Ratio "n Ratio 

AOE QuanBtative Scores sius x ze ix ra togas RR n ists —.02 —.08 d 
ACE Linguistic Score... : —.06 —.09 E: 
ACE Total Score... —.05 —.10 ps 
MacQuarrie ........... ss 0905 —.10 p 
Mechanical Comprehension. .......... ccce eese esc secus A4 07 25 
PBA DSC SHOR MUR GAN IS tcs e sp nescit ci tice jose ROMS mt -06 .05 sm 
GED Natural Science: 20 19 AS 
Effectiveness of Expression........0..00000000000cseceeees. OL —.02 36 


Association would not be affected by this 
factor and should have validity comparable 
with that of the tests given in 1948. 

The correlations of the tests in the American 
Dental Association battery with honor point 
ratio are summarized in Table 2. 

The tests of the American Dental Associa- 
tion show the same phenomenon exhibited by 
the University of Michigan tests. They have 
little predictive value for those admitted in 
1947 but substantial value in 1948. Hence, it
is unlikely that improper administration of the 
1947 battery can account for the phenomenon. 

The next hypothesis concerning the change 
in the predictive value of the tests from 1947 
to 1948 is that the 1947 group exhibited a more 
limited range of ability. Table 3 shows the 
means and standard deviations of the scores 
of the two groups. 


The data presented do not support the hy-
pothesis that the lack of predictive value of
the tests given in 1947 is due to a restricted
range of test scores or of achievement (HPR).

A third hypothesis is that the system of
grading changed from 1947 to 1948. This
hypothesis does not seem to be consistent with
the evidence. A discussion of the problem
with administrative officials indicated that no
changes had been made in the grading system.

A fourth hypothesis is that the group selected
in 1947 was in some way different from that
selected in 1948. Differences between the
mean scores of the two groups are small and
substantially the same part of the aptitude
scale was used for prediction in the two years.
It is, however, possible that in the case of the
group admitted in 1947, there might have been
a tendency on the part of the admissions officer
to accept either high grades in pre-professional
work or high test scores for admission. If this
were the case to any extent it might result in
the selection of a group in which there was a
correlation of zero between previous scholastic
record and test scores. This seems to be the
case, for as Table 4 demonstrates, the group
selected in 1947 shows a generally lower corre-
lation between previous scholastic record and
test scores than those selected in 1948.


Table 2

Coefficients of Correlation Between American Dental
Association Tests and Honor Point Ratio

[Columns: 1947 Admissions, 1st Year and 2nd Year Honor Point
Ratio; 1948 Admissions, 1st Year Honor Point Ratio. The row
labels and most coefficients are not legible in this copy; the
only legible test names are Object Visualization and Natural
Science Survey.]


Table 3

Standard Deviation and Mean of Test Scores of Those Admitted in 1947 and 1948

                                 1947 Admissions        1948 Admissions
                                        Standard               Standard
                                Mean    Deviation      Mean    Deviation
ACE Quantitative Score          51.6      8.4          50.5      7.94
ACE Linguistic Score            75.2     11.8          78.2     12.94
ACE Total Score                136.9     17.05        128.7     18.23
MacQuarrie                      78.8     12.25         78.0     10.46
Mechanical Comprehension        41.7      8.73         37.0     10.87
Paper Form Board                47.6      6.69         48.3      7.25
GED Natural Sciences            56.8     10.82         50.3     11.00
Effectiveness of Expression     46.4      8.77         43.6     10.36
Honor Point Ratio (1st Yr.)      1.56     0.44          1.53     0.45


Table 4

Coefficients of Correlation Between Test Scores and
Average Grade in Pre-professional Work

                                  1947          1948
                               Admissions    Admissions
ACE Quantitative Score            .22           .28
ACE Linguistic Score               ?            .25
ACE Total Score                   .19           .30
MacQuarrie                       -.29           .02
Mechanical Comprehension         -.18           .24?
Paper Form Board                  .03           .16
GED Natural Sciences              .22           .37
Effectiveness of Expression       .07           .25

[? marks an entry that is illegible or doubtful in this copy.]


If the type of selective process under dis- 
cussion were operative in 1947 one would 
expect not only that the group selected would 
show exceptionally low correlations between 
test scores and previous scholastic record but 
also that there might be exceptionally low cor- 
relations between test scores and subsequent 
performance. 
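
The effect of an either/or admission rule on the correlation within the admitted group can be illustrated by simulation. This is a sketch under stated assumptions, not a reconstruction of the 1947 admissions: grades and test scores are drawn as standard normal variables correlating about .50 among applicants, and an applicant is admitted if either measure exceeds a cutoff of one standard deviation. The function names, the cutoff, and the applicant correlation are all hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, rho=0.5, cutoff=1.0, seed=0):
    """Return (r among all applicants, r among those admitted)."""
    rng = random.Random(seed)
    grades, tests = [], []
    for _ in range(n):
        g = rng.gauss(0, 1)
        # Test score shares variance with grades so that r is about rho.
        t = rho * g + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        grades.append(g)
        tests.append(t)
    # Either/or rule: high grades OR a high test score is enough.
    admitted = [(g, t) for g, t in zip(grades, tests)
                if g > cutoff or t > cutoff]
    r_all = pearson_r(grades, tests)
    r_admitted = pearson_r([g for g, _ in admitted],
                           [t for _, t in admitted])
    return r_all, r_admitted


r_all, r_selected = simulate()
```

In this sketch the admitted group contains many students who are high on one measure and unremarkable on the other, so the correlation between the two measures among those admitted falls well below its value among all applicants.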


Summary 


This study has been presented to illustrate 
the fact that the process of validating tests for 
admission to an educational program by finding 
the correlation with subsequent grades needs 
to be more carefully scrutinized. It is 
possible that the correlations between test 
scores and grades may be zero although the 
instrument is valid for selection purposes. 
The process of selection may control the size 
of validity coefficients and should be properly 
controlled in any experimental validation of 


a test. 
Received October 7, 1949. 


Intercorrelations in Merit Rating Traits 


C. E. Jurgensen 
Minneapolis Gas Company 


Rating scales used by different companies 
vary considerably in the number of traits rated. 
In an analysis of 132 rating scales, Mahler (6) 
found that the number of traits varied from one 
to thirty-three with a mean of 9.3. Com- 
panies using such scales appear to make two 
assumptions: (1) the traits rated are important 
from management's point of view, and (2) 
ratings on the various traits are discrete. This 
study deals only with the second of these 
assumptions. 

Kornhauser (5) reported intercorrelation of 
instructors’ ratings on 68 students who were 
rated on intelligence, industry, accuracy, co- 
operativeness, initiative, moral trustworthi- 
ness, and leadership. Intercorrelations ranged 
from .45 to .83 with a median of .69. 

Driver (2) reported a study of a ten trait 
merit rating scale from which several traits had 
been discarded because ratings were not dis- 
crete. The intercorrelations of the remaining 
ten traits (based on N’s varying from 100 to 
300) ranged from .11 to .79 with a mean of .46. 

Ewart, Seashore and Tiffin (3) reported a 
study of ratings of 1120 men on a twelve trait 
scale covering safety, knowledge of job, versa- 
tility, accuracy, productivity, overall job per- 
formance, industriousness, initiative, judg- 
ment, cooperation, personality, and health. 
Intercorrelations on the twelve trait scale 
ranged from .25 to .88 with a median of .75. 
Reliability coefficients were not reported. A 
factor analysis showed that a general factor 
(ability to do the present job) accounted for 
most of the total variance. A second, and
oblique, factor was tentatively named “skill
possessed over and above the requirements for
the specific job." The third and final factor 
(health) was discarded as an artifact. 

Bolanovich (1) made a factor analysis of
ratings on 143 field engineers whose work
[...] a range of abilities and character-
istics. They were rated on fourteen traits:
[...], personal appearance, punctuality,
thoroughness, efficiency, resourcefulness, de-
pendability, cooperation, job attitude, tech-
nical ability, sales ability, organizing ability,
judgment, and desire for self improvement.
Intercorrelations ranged from .05 to .73 with
a median of .49. A factor analysis resulted in
six common factors: attendance to detail,
ability to do the present job, sales ability, con-
scientiousness, organizing or systematic tend-
ency, and social intelligence. The correla-
tion between factors was not reported.

Intercorrelations given in the above studies
appear to be typical of those usually reported
when rating scales consist of relatively [...]
and specifically defined traits. Such [...]-
ness is generally believed to be advisable. For
example, Paterson (7) has said “each trait [...]
rated should be restricted to a single [...]
activity or to the results of a single [...]
activity, otherwise the ratings will be am-
biguous.” Although much can be said in favor
of restricted traits, the fact remains that inter-
correlations have generally been high. The
hypothesis was therefore proposed that rela-
tively independent factors would be obtained
if each factor consisted of a cluster of traits
which might logically be expected to be highly
correlated in the rating situation, and if the
major factors could logically be expected to be
relatively discrete in actuality. The testing of
this hypothesis was a primary purpose of the
research reported here.

The hypothesis was also proposed that re-
ported intercorrelations between traits are
spuriously high due to differences between
raters in leniency and variability. The pos-
sible effect of such differences seems to have
been given consideration with respect to inter-
pretation of ratings on specific employees, but
to have been ignored consistently with respect
to intercorrelation of traits. The testing of
this hypothesis was also a primary purpose of
the research reported here.
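
How differences between raters in leniency could inflate trait intercorrelations can be shown with a small simulation. It is a sketch only and not the procedure used in this study: two traits are generated as unrelated within each rater, every rater adds a personal leniency constant to all of his ratings, and the pooled correlation is compared with the correlation after each rater's mean is subtracted. Differences in variability are not modelled. All names and parameter values are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_ratings(n_raters=200, n_ratees=40, leniency_sd=1.0, seed=0):
    """Return (rater, trait_a, trait_b) rows with rater leniency added."""
    rng = random.Random(seed)
    rows = []
    for rater in range(n_raters):
        leniency = rng.gauss(0, leniency_sd)  # same shift on every trait
        for _ in range(n_ratees):
            rows.append((rater,
                         leniency + rng.gauss(0, 1),
                         leniency + rng.gauss(0, 1)))
    return rows


def center_within_rater(rows):
    """Subtract each rater's own mean from his ratings on each trait."""
    sums = {}
    for rater, a, b in rows:
        s = sums.setdefault(rater, [0.0, 0.0, 0])
        s[0] += a
        s[1] += b
        s[2] += 1
    return [(a - sums[r][0] / sums[r][2], b - sums[r][1] / sums[r][2])
            for r, a, b in rows]


rows = simulate_ratings()
r_pooled = pearson_r([a for _, a, _ in rows], [b for _, _, b in rows])
centered = center_within_rater(rows)
r_centered = pearson_r([a for a, _ in centered], [b for _, b in centered])
```

In the sketch the pooled correlation is substantial even though the traits are unrelated within any one rater, and it drops to about zero once each rater's general level is removed.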




Description of Scale 


The rating scale developed for this study 
consisted of six traits which were thought to be 
reasonably discrete plus ratings on overall 
value and potentiality. The nature of the 
scale is illustrated by the first trait: 


Work Habits. Consider initiative, industry, 
persistence, accuracy, carefulness of detail, 
orderliness, thoroughness, speed, punctuality 
and other related work habits. 


0 - Unsatisfactory, does not meet job requirements
1
2 - Below average, partially meets job requirements
3
4 - Average, meets job requirements
5
6 - Above average, exceeds job requirements
7
8 - Outstanding, far exceeds job requirements


Definitions of other traits used in the scale 
are given below. Descriptions of degrees were 
similar in nature to those given under Work 
Habits. 


Attitudes. Consider employee's enthusiasm,
interest, loyalty, willingness to cooperate, team-
work, open mindedness, tolerance, ability to
accept criticism, and other such attitudes
toward the job, fellow workers, supervisor, and
company.

Acceptance by Others. Consider whether this
employee is liked and respected by co-workers
and public, how they react to his appearance,
voice and manner, tact, courtesy, agreeable-
ness, friendliness, etc.

Self Control. Consider his emotional matu-
rity and balance, disposition, stability, calm-
ness, maturity of action, etc.

Mental Ability. Consider his alertness, rea-
soning ability, common sense, memory, judg-
ment, analytical ability, ability to express self
orally or in writing, mechanical intelligence,
ease of learning new work and remembering
instructions, and other similar aspects of in-
telligence.

Physical Ability. Consider his agility, coor-
dination, dexterity, vision, hearing, endurance,
energy, strength, overall health, etc.

Overall Rating. Decide whether he is cur-
rently an asset or liability to your department,
considering all evidence (whether mentioned
in above characteristics or not) which affects
his all-around performance at the present time.

Potentiality. To what extent does this em-
ployee possess capacity for future growth?
Consider his all-around performance at the
present time in light of his age, health, experi-
ence, etc., and the extent to which these and
other factors will foster or prevent future
development.


Method 


Instructions to raters required them to 
decide to what degree the employee possessed 
the trait under consideration and to encircle 
the number which best described him with re- 
spect to the position he held. Ratings on all 
traits were on a nine point scale, each de- 
scription including one or two equivalent 
phrases. Descriptions of degrees were selected 
on the basis of a rough application of Thurs- 
tone’s method for scaling equal appearing 
intervals (4). 

Results reported here are based on ratings 
of 199 employees by 26 supervisors. The em- 
ployees were all hired within a six week period, 
were all on the same job, and were randomly 
divided into 26 crews having seven or eight 
employees each. The work was such that 
supervisors had considerable contact with 
their crew members on an individual basis, but 
relatively little contact with their crew as a 
unit. Ratings were obtained three months 
after employment and again five months after 
employment. Reliability coefficients for the 
eight traits were obtained from the repeated 
ratings. 

Each supervisor had rated seven or eight men 
on two occasions on eight traits. Therefore 
approximately 120 ratings by each supervisor 
were available. Inspection of these ratings 
showed that supervisors differed considerably 
both in mean rating and variability of ratings. 
On the nine point rating scale, mean ratings by 
supervisors ranged from 3.7 to 7.0. Varia- 
bility, in terms of sigma, ranged from .8 to 2.1. 
All ratings were converted to standard scores 
(having a mean of 50 and sigma of 10) on the 
assumption that differences in ratings were due 
to differences between raters rather than em- 
ployees. The further assumption was made 
that a rater who differed from others in leniency 
or variability on one trait would differ similarly 
on other traits. These assumptions appeared 
tenable in this situation. 
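
The conversion described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper does not give its computing procedure, so the function name, the sample ratings, and the use of the population standard deviation for sigma are all hypothetical choices made here. Each supervisor's pooled ratings are rescaled so that every supervisor ends up with a mean of 50 and a sigma of 10.

```python
from statistics import mean, pstdev


def to_standard_scores(ratings_by_rater, target_mean=50.0, target_sigma=10.0):
    """Rescale each rater's pooled ratings to a common mean and sigma.

    Assumption: sigma is taken as the population standard deviation of
    all ratings made by that rater (the paper does not say which form
    of sigma was used).
    """
    converted = {}
    for rater, ratings in ratings_by_rater.items():
        m = mean(ratings)
        s = pstdev(ratings)
        if s == 0:
            # A rater with no spread gives no basis for scaling;
            # every rating is placed at the target mean.
            converted[rater] = [target_mean for _ in ratings]
        else:
            converted[rater] = [
                target_mean + target_sigma * (x - m) / s for x in ratings
            ]
    return converted


# Hypothetical ratings on the nine point (0 to 8) scale.
example = {
    "lenient_rater": [6, 7, 8, 7, 6, 8, 7],
    "severe_rater": [2, 4, 3, 5, 4, 3, 4],
}
standard = to_standard_scores(example)
```

After conversion the lenient and the severe rater have the same mean and spread, so a correlation computed across raters no longer reflects differences in leniency or variability.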


Results 


Intercorrelations of ratings expressed in
standard score form are given in Table 1.
They range from .33 to .84 with a median of
.60. These same intercorrelations expressed
in original raw score ratings range from .60 to
.88 with a median of .76. Each of the 28 inter-
correlations dropped in size when ratings were
expressed in standard score form, the range of
drop being from .01 to .44 with a median of .16.


Table 1
Reliability and Intercorrelations of Ratings, Standard Score Form (N = 199 Conversion Employees)

                       Work    Atti-   Acceptance  Self     Mental   Physical  Overall  Potenti-
                       Habits  tudes   by Others   Control  Ability  Ability   Rating   ality
Mean                   48.1    49.9    50.2        48.4     49.3     49.1      47.7     ?
Sigma                  10.1    10.9     9.2         9.3      8.6      8.7      11.0     ?
Work Habits            ?
Attitudes              .66     .54
Acceptance by Others   .47     .33     .45
Self Control           .68     .68     .53         .56
Mental Ability         .65     .60     .48         .63      ?
Physical Ability       .55     .52     .45         .63      .59      .53
Overall Rating         .84     .62     .52         ?        .66      .39       ?
Potentiality           .63     .55     .49         .65      .74      .51       .58      ?
Median                 .65     .60     .48         .65      .63      .53       .62      .58

Reliabilities of ratings in standard score
form are also given in Table 1. They range
from .45 to .68 with a median of .55. Relia-
bilities computed from original raw score
ratings range from .68 to .75 with a median of
.73. Each of the eight reliabilities dropped in
size when ratings were expressed in standard
score form, the range in drop being from .07 to
.25 with a median of .18.


Summary 


The attempt to develop a six trait scale in
which each trait consisted of a cluster of sub-
traits did not result in a scale wherein the
major traits were independent of each other.
Reliabilities and intercorrelations of these
traits were essentially of the magnitude usually
reported for more restricted traits. The first
hypothesis was therefore refuted.


weighted) are summed to obtain a total
score which is to be used for any purpose. It
is simpler, more direct, and equally effective to
obtain an overall rating instead of a score
based on highly correlated trait ratings. This
does not necessarily deny all value of trait
ratings. Overall ratings may be more valid
or reliable if made after consideration has been
given to traits, even though trait intercorrela-
tions are high. Second, even though intercor-
relations are high, there will be some few indi-
viduals who are rated high on one trait and
low on another. In such cases the trait
ratings can be helpful. This appears particu-
larly true when a primary purpose of rating is
to serve as a basis for a supervisor-employee
conference on employee progress.

When converted to standard score form, in-
tercorrelations dropped in magnitude but
reliabilities dropped equally as much. There-
fore the various trait ratings remained of little
if any value as discrete traits. The great
drop in reliabilities which occurred when
original ratings were converted to standard
scores indicates that reliability coefficients may
be spuriously high if obtained from data not
equated for differences in mean and sigma from
one rater to another. The second hypothesis
was therefore confirmed. Unless an investi-
gator gives evidence that he has considered and
eliminated the effects of any such differences,
his reliability coefficients should be questioned
to the point of rejection.


Received October 24, 1949. 




References 


1. Bolanovich, O. J. Statistical analysis of an indus- 
trial rating chart. J. appl. Psychol., 1946, 30, 
23-31. 

2. Driver, R. S. A case history in merit rating. Per- 
sonnel, 1940, 16, 137-162. 

3. Ewart, E., Seashore, S. E., and Tiffin, J. A factor 
analysis of an industrial merit rating scale. J. 
appl. Psychol., 1941, 25, 481-486. 


4. Guilford, J. P. Psychometric methods. New York: 
McGraw-Hill, 1936, 440-444. 

5. Kornhauser, A. W. A comparison of ratings on
different traits. Personnel J., 1927, 5, 440-446.

6. Mahler, W. R. Some common errors in employee 
merit rating practices. Personnel J., 1947, 26, 
68-74. 

7. Paterson, D. G. Principles of merit rating. Per- 
sonnel Digest, 1944, 1, 16-20. 


How Readable Are Corporate Annual Reports? * 
Siroon Pashalian


New York University 


and 


William J. E. Crissy 
Queens College 


Business enterprises have long been con- 
cerned with communication problems. Today 
there is increasing interest in how best to “get 
the word around” to jobholders, shareholders, 
customers, and the general public. One of 
the first formal media of communication to be 
used was the corporate annual report. In 
recent years an extensive literature has de- 
veloped concerning how best to construct and 
publish such reports (1, 2, 7, 8). Most of the 
papers have reflected judgments of varying 
degrees of expertness rather than findings based 
upon experimental research. 

A current problem in the construction of the 
annual report is that of endeavoring to make 
it more understandable and more widely read. 
Readership surveys show a considerable apathy 
to company reports. Yet, it is by no means 
an easy task to prepare adequate and concise 
reports. The challenge is one of presenting 
sufficient technical data and information for 
the financial expert, satisfying the require- 
ments of the law (especially in railroad and 
public utility reporting), and at the same time, 
being meaningful to those whose interests are 
of a more general nature. The present de- 
mand crystallizes as the need for writing an 
informative account of the year’s operations, 
and for presenting the material in such a way 
that the report will be read. 

In connection with this problem, one of the
writers undertook to investigate the readability
of corporate annual reports by means of the
new “Flesch Readability Formulas” (5, 5a).

* This paper is a report on some of the [...]
Pashalian's M.A. thesis entitled, “An In[...]
the Application of the New Flesch Readability
Formulas to Corporate Annual Reports,” [...]
Department of Psychology [...] and Science at
New York University. [...]


Method 


The annual reports of those corporations 
that are listed in the Corporate Billion-Dollar 
Club in the June 11, 1949 issue of Business 
Week were included in the present study. 
These members are either non-financial com- 
panies with assets of over $1-billion, or with 
annual revenues or sales of over $1-billion, or 
both. Presumably then, these corporations 
have the largest number of stockholders, em- 
ployees and other persons interested in their 
operations. In other words, there is substan- 
tial public interest in these big corporations; 
their annual reports are expected to reach a
vast audience.

Applying the sampling technique suggested
by Flesch (3, 4, 5), one-hundred-word samples
were chosen from every other page of each of
these twenty-six reports. This procedure,
with no restrictions on the number of samples
to be taken per report, is believed to have
achieved a fair sampling in proportion to the
length and breadth of the report. A total of
211 samples were examined; the average
number of samples per report was 8.1.
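
The sampling rule can be pictured roughly as follows. This is a hypothetical sketch, not the authors' procedure: the function name, the choice to start with the first page, and the choice to take the first hundred words of each selected page are assumptions, since the paper states only that one-hundred-word samples came from every other page.

```python
def hundred_word_samples(pages, sample_size=100, step=2):
    """Take one fixed-size word sample from every `step`-th page.

    Assumptions (not stated in the paper): sampling starts with the
    first page, each sample is the first `sample_size` words of the
    page, and pages with fewer words are skipped.
    """
    samples = []
    for page_text in pages[::step]:
        words = page_text.split()
        if len(words) >= sample_size:
            samples.append(words[:sample_size])
    return samples


# Hypothetical five-page report, 150 words per page.
report_pages = [" ".join(f"word{i}" for i in range(150)) for _ in range(5)]
report_samples = hundred_word_samples(report_pages)

# Figures reported in the paper: 211 samples from 26 reports.
average_samples_per_report = round(211 / 26, 1)
```

Because the number of samples grows with the number of pages, longer reports contribute more samples, which is the sense in which the sampling is proportional to the length of the report.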


Results


The findings on the application of the new
Flesch Readability Formulas to the twenty-six
annual reports are listed in Table 1.


Analysis of the Reading Ease Measures


The range of readability scores was from 6
to 58. According to Flesch reference cate-
gories for these scores (5, 5a), these measures
vary within descriptive styles of very difficult,
as with material of scientific and professional
journals, to fairly difficult, as in literary and
quality magazines, such as Harper’s. This 
range, interpreted in terms of the educational
attainments of the U. S. adult population,
suggests a potential audience of from 4½% of
the population completing college, to 40% of
the population who have had some high school
education (4).

The average Reading Ease score for the 
entire set of reports was 34.37. Writing at this 
level is generally difficult, and descriptive of the 
style in academic material; for example, the 
Yale Review, which may be comprehended by
24% of the population who have graduated 
from high school or have had some college 
training. 

The measure of average sentence length
ranged from 16 (fairly easy, typical of slick
fiction and understandable by 80% of the
population) to 53 (very difficult, above much
scientific material and understood by approxi-
mately 4½% of the population).

The measure of the average number of syl- 
lables per 100 words ranged from 156—fairly 
difficult, to 183—difficult. Flesch has advised 
that a comfortable text contains one and one- 
half times as many syllables as words (6). 
The preponderance of reports in the difficult 
category on this measure (22 of the 26 reports) 
is indicative of a high level of abstraction in 
the language of these reports. 
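
For reference, the Reading Ease score combines the two measures just discussed. The sketch below uses the formula published in Flesch's "A new readability yardstick" (5); the function names and the category table are written here for illustration and are not taken from the paper. Note that the paper's report-level scores are averages over samples, so applying the formula to a report's average figures only approximates them.

```python
def reading_ease(avg_sentence_length, syllables_per_100_words):
    """Flesch Reading Ease (1948 formula).

    avg_sentence_length: average number of words per sentence.
    syllables_per_100_words: syllable count per 100 words.
    """
    return 206.835 - 0.846 * syllables_per_100_words - 1.015 * avg_sentence_length


# Flesch's descriptive categories, keyed by the lower bound of each band.
_EASE_BANDS = [
    (90, "very easy"),
    (80, "easy"),
    (70, "fairly easy"),
    (60, "standard"),
    (50, "fairly difficult"),
    (30, "difficult"),
    (0, "very difficult"),
]


def describe_reading_ease(score):
    """Return the descriptive style for a Reading Ease score."""
    for lower_bound, label in _EASE_BANDS:
        if score >= lower_bound:
            return label
    return "very difficult"  # Scores below zero are treated as the hardest band.


# Sears, Roebuck figures from Table 1: 22 words per sentence, 162 syllables.
sears_score = round(reading_ease(22, 162))
```

With the Sears, Roebuck figures the formula gives 47, the value listed for that report, and the overall average of 34.37 falls in the "difficult" band.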

Table 1
Summary of Reading Ease and Human Interest Measures on the Twenty-six Annual Reports

Industry               Annual Report            Average   Average    Reading  Per Cent  Per Cent   Human
(Average                                        Sentence  Number of  Ease     Personal  Personal   Interest
Reading Ease)                                   Length    Syllables  Score    Words     Sentences  Score
Merchandise            Sears, Roebuck           22        162        47       2         0          7
  44.50                Montgomery Ward          16        175        43       0         0          0
Communications         Bell Telephone           24        165        43       1         0          4
  43.00
Foods                  Swift & Co.              ?         156        58       5         2          19
  43.00                Armour & Co.             21        168        43       4         0          15
                       Safeway Stores           27        182        28       2         0          7
Autos & Accessories    General Motors           21        174        38       2         0          7
  40.00                Chrysler Corp.           20        171        42       1         0          4
Oil                    Standard Oil (N. J.)     21        179        ?        0         0          0
  33.50                Standard Oil (Ind.)      25        173        35       2         0          ?
                       Socony-Vacuum            21        180        33       2         0          7
                       Texas Co.                23        182        32       1         0          4
                       Gulf Oil                 24        175        34       1         0          4
                       Standard Oil (Cal.)      24        178        ?        ?         0          0
Utilities              Consolidated Edison      27        166        ?        ?         0          4
  31.50                Commonwealth-Southern    31        183        24       0         0          0
Railroads              Pennsylvania             27        170        ?        ?         0          4
  28.50                NY Central               25        174        34       ?         0          ?
                       Southern Pacific         32        175        ?        ?         ?          ?
                       Santa Fe                 32        177        25       1         0          4
                       Baltimore & Ohio         ?         ?          ?        ?         ?          ?
                       Union Pacific            ?         ?          ?        ?         ?          ?
Machinery & Supplies   General Electric         30        178        26       0         0          0
  26.00
Metals & Chemicals     U. S. Steel              ?         ?          ?        ?         ?          4
  ?                    E. I. du Pont            31        ?          ?        ?         ?          ?
                       Bethlehem Steel          39        ?          ?        ?         ?          ?


In relation to this syllable measure, the
factor of numbers was encountered in the
material of the annual reports. Under Flesch’s
directions, numbers separated by space are
counted as words in any text; several and
lengthy figures should be omitted from the 
syllable count. Instead, a corresponding num- 
ber of words to the number of figures omitted 
should be added, and their syllable totals 
added to those already counted (5). When 
applying these directions to the samples, the 
number of figures per 100 words was also re- 
corded. The average number of figures per 
100 words ranged from 2.30 to 8.80. The 
highest number of figures per 100 words ranged 
from 5 to 21. Thus, a surprisingly large num- 
ber of figures appeared in small samples of 

100 words. Their disastrous effect on the 
general reader not widely trained in numbers 
or mathematics invites speculation. It would 
seem, therefore, that greater care and attention 
should be given to determining best ways of 
presenting figures in such reports. Hundred- 
word samples that are crowded with 10 to 20 
lengthy figures should caution writers or 
editors, and suggest their more effective incor- 
poration in a table or chart. 

The significance of these results on reada-
bility obtained by analysis utilizing the Flesch
technique is perhaps best indicated by showing
what would improve the “scores” made. A
need is demonstrated for more effective use of
punctuational devices. The semicolon, for in-
stance, can be more widely used to shorten
sentence lengths and at the same time, to
maintain any indications that the words and
information belong closely together.

Similarly, the need for writing at a less
difficult level is indicated. A “writing down”
would not necessarily be an under-estimation of
intelligence. People whom corporations want
to influence probably range from low normal
intelligence to the superior. They have the
capacity to grasp such concepts as gross sales,
[...] the vast audience [...] reports reach is
assumed by corporate reporters to possess far
greater language facility than it does. The
language of these [...] of the education of the
[...] above, is too difficult [...] diversified
readership [...] corporate writers are over[...]


Human Interest Value 


The range of Human Interest scores was 
from 0 to 19. Again, in terms of Flesch refer- 
ences (5, 5a), these styles are from dull, de- 
scriptive of the style in scientific journals, to 
mildly interesting, descriptive of trade maga- 
zines. The average Human Interest score for 
the entire set of reports was 4.27—dull. 
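
The Human Interest score is built from the two personal-element counts shown in Table 1. The sketch below uses the formula from Flesch's "A new readability yardstick" (5); the function names and the band table are illustrative and do not come from the paper.

```python
def human_interest(personal_words_per_100, personal_sentences_per_100):
    """Flesch Human Interest score (1948 formula).

    personal_words_per_100: personal words per 100 words.
    personal_sentences_per_100: personal sentences per 100 sentences.
    """
    return 3.635 * personal_words_per_100 + 0.314 * personal_sentences_per_100


# Flesch's descriptive categories, keyed by the lower bound of each band.
_INTEREST_BANDS = [
    (60, "dramatic"),
    (40, "highly interesting"),
    (20, "interesting"),
    (10, "mildly interesting"),
    (0, "dull"),
]


def describe_human_interest(score):
    """Return the descriptive style for a Human Interest score."""
    for lower_bound, label in _INTEREST_BANDS:
        if score >= lower_bound:
            return label
    return "dull"


# Swift & Co. figures from Table 1: 5 personal words, 2 personal sentences.
swift_score = round(human_interest(5, 2))
```

The Swift & Co. figures give 19, the top of the range reported above, while a report with two personal words per hundred and no personal sentences scores 7.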

Thus, the corporate writing of these twenty-
six annual reports is extremely low in human
interest value, i.e., in the personal words and
sentences which provoke and continue general
reader interest, and help the reader to under-
stand the text better. In an era which is
fostering the teamwork and cooperation of
stockholders and employees alike, the need for
the stress on “we” and “our” and “us” is
inescapable. “I” can serve to bridge the
tremendous gap between the President and
stockholders and employees. This set of re-
ports used such words sparsely. Personal
words help to convey the feeling in the material
of having been written directly to the reader,
whoever he may be. They can reflect the
whole spirit and tone of the organization.

In addition, corporate writing can direct
greater attention to individual personalities.
Although the sampling necessarily tapped only
certain pages, extremely few samples men-
tioned specific persons and their accomplish-
ments. It appeared that much of this kind of
information was confined to Employee Rela-
tions headings or obituaries. People are in-
terested in people, and they want to become
better acquainted with the outstanding per-
sonalities of the corporation. Yet, among the
21,100 words sampled in this study, only ap-
proximately twenty names were mentioned,
and even these were noticeably concentrated
within certain reports.

Moreover, to enhance human interest value,
there remains the need for the appropriate use
of personal sentences: exclamations, questions,
and commands directly expressed to the reader.
Only one report among the twenty-six possessed
a sentence of this description in the sampling
scheme, a question. More question marks in
annual reports can provoke thought or revive
reader interest. Similarly, direct
commands are another interest-controlling
device. Instead of the impersonal, “It may be
noted . . . ,” “Note . . .” can do much more
to invoke the effort of a glance at the charts
and an independent analysis.


Sample Passages 


To illustrate the various levels of difficulty 
obtained by means of the new Flesch Read- 
ability Formulas, sample passages from the 
annual reports of those corporations which 
ranked twenty-sixth, thirteenth, and first in 
Reading Ease are furnished below: 

From the Union Pacific Railroad report: 


Capital Stock 


At the annual meeting of Union Pacific Rail-
road Company stockholders held on May 11,
1948, the Articles of Association were amended
so that on July 1, 1948 the total number of
authorized shares of preferred and common
stocks of the Company were doubled (with no
increase in the total aggregate par value
thereof), and the then outstanding 995,431
shares of $100 par value preferred stock became
1,990,862 shares of $50 par value preferred
stock, and the then outstanding 2,222,910
shares of $100 par value common stock became
4,445,820 shares of $50 par value common
stock, each of the new $50 par value preferred
and common shares being entitled to one vote
at any meeting of stockholders.


From the New York Central Railroad report: 


Dieselization is progressing

Carrying forward our motive power moderni-
zation, the Central and leased lines, together
with two affiliates, the Pittsburgh & Lake Erie
and the Indiana Harbor Belt Railroads, ordered
in 1948 new Diesel-electric locomotives at a
total cost of approximately $33,600,000. The
bulk of these locomotives, on which deliveries
will extend into 1950, are for road freight and/
or switching service. The Central's portion
was about $24,790,000.

Locomotives delivered during 1948 increased
the Dieselized portion of the total road freight
train mileage of the Central and leased lines to
approximately 13.5 per cent by the end of the
year.


From the Swift and Company report: 


What Swift & Co. is Trying To Do 


The public rightly expects a business to
accomplish certain desirable things.

Who determines what is desirable in a free
country? Not one man or a group of men.
Each individual decides for himself whether he
will buy from, sell to, work for, or invest in a
company.

A decision to buy a product is a vote in its 
favor. The votes of millions of people may 
cause prices to go up or down. The results 
quickly tell what the public thinks desirable. 

Such economic democracy can thrive only in 
a certain climate—one in which prices are free 
and competitive, and business is spurred by 
the hope of profits and the fear of losses. 


Analysis by Industries 


The arrangement of the twenty-six reports 
by industry in Table 1 facilitates interesting
and noteworthy comparisons. There seems 
to be a certain amount of homogeneity within 
industries on all the obtained measures of the 
Flesch formulas. At the same time, however, 
it must be remembered that the entire set of 
reports has demonstrated a great degree of 
homogeneity and narrow range under the 
Flesch technique. 

The reports of railroad companies have the 
greatest amount of variability on the measures 
employed. Their range in average sentence 
length is from 18 words (standard) to 53 words 
(very difficult). This observation seems to re- 
flect the great difficulty attached to railroad 
reporting due in part to legal specifications con- 
cerning content. Apparently, however, some 
railroad companies are fulfilling their legal and 
public obligations in a more effective manner 
of readability than others. 

Another striking inference may be obtained 
from Table 1. The arrangement by rank order 
in readability attainments under the formulas, 
parallels almost directly the degree of contact 
the companies have with the general public. 
Merchandising, Communications, Foods, Auto- 
mobiles and Accessories corporations cater to 
larger sections of the general public. Cor- 
porations dealing in Machinery and Supplies, 
Metals and Chemicals have a more restricted, 
less diversified market for their products (8). 
Since this observation is based on the small 
and variable groups of the study, however, 
more data and analysis are actually required 
for final proof. 

Nevertheless, such an arrangement is by no 
means unwarranted. It is generally accepted 
that the extent and character of the public 
interest should first be determined in the con- 
struction of the report. Then a corporate
writer attempts to write to that audience.
However, if this same trend had appeared
within lower degrees of difficulty, it could be
considered a more legitimate consequence of
the nature of the enterprise and the groups
interested in its operations. At the same time,
it would then meet the readability require-
ments of these particular audiences.


Summary 


1. Analysis of the readability of the twenty- 
six annual reports of corporations listed in the 
Billion-Dollar Club of Business Week, June 11, 
1949, by means of the new Flesch Readability 
Formulas, revealed that, on the whole, the 
general level of reading was difficult, and the 
human interest value dull. 

2. These reports contain language which is 
beyond the language experience and fluent com- 
prehension of approximately 75% of the U. S. 
adult population. 

3. The Flesch technique demonstrates prom- 
ise as a method for indicating the difficult 
language elements in corporate reports. 

4. It also demonstrates promise as a method 
for spotting “impersonalness” in such writing.


When, as in this study, the writing sample 
involves a problem of mass communication, 
the Flesch technique appears to be a reasonable 
instrument. It gauges the likelihood that 
these annual reports will convey their messages 
to most of their prospective readers. Wider 
application of the technique in the construction 


of the annual report is recommended. Used in 
conjunction with the other types of practical 
hints in the literature, it can serve to strengthen 
the annual report as the most important single 
written communication between management 
and stockholders, employees, and the general 
public. 


Received October 7, 1949. 


References 


1. Dale, E. Preparation of company annual reports. 
Research Report No. 10, American Management 
Association, New York, 1946, 104 pp. 

2. Doris, Lillian. Modern corporate reports. New 
York: Prentice-Hall, 1948. 

3. Flesch, R. F. Marks of readable style; A study in 
adult education. New York: Bureau of Publi- 
cations, Teachers College, Columbia University, 
1943. (Contributions to Education, No. 897.) 

4. Flesch, R. F. The art of plain talk. New York: 
Harper and Brothers, 1946. 

5. Flesch, R. F. A new readability yardstick. J. 
appl. Psychol., 1948, 32, 221-233. 

5a. Flesch, R. F. The art of readable writing. New 
York: Harper and Brothers, 1949. 

6. Flesch, R. F. Making the narrative readable. 
Chapter 15 in Modern Corporate Reports by 
Lillian Doris, New York: Prentice-Hall, 1948. 

7. Gibson, W. B., ed. The annual report; A study of
over 500 financial reports of leading American
Business Institutions showing the present style
trend and important physical characteristics.
Chillicothe, Ohio: Mead Corporation, Marketing
Research Division, 1939.

8. McLaren, N. L. Annual reports to stockholders:
Their preparation and interpretation. New York:
Ronald Press, 1947.


Rorschach Responses, Strong Blank Scales, and Job 
Satisfaction Among Policemen 


Solis L. Kates 
Michigan State College 


An individual with certain personality char-
acteristics, just short of thoroughgoing psy-
chotic disorganization, may be attracted to and
satisfied with an occupation because his per-
sonality traits are compatible with its demands.
Certain personality traits, whether they con-
tribute to a high degree of adjustment or
maladjustment, may be of importance for
or lead to the development of interest and
satisfaction in a particular occupation. For
instance, many of the employed patients seen
at a mental hygiene clinic gave Rorschach
responses that were markedly abnormal, yet
they were able to carry on satisfactorily in
their occupations.

The behavior of an individual should be
studied not only in relation to the total culture
but with regard to his sub-cultural group and
its demands upon him. When the personality
tendencies of an individual are compatible
with the demands of a particular sub-culture,
probably little anxiety or self-dissatisfaction
may be provoked. If these personality traits
conflict with the sub-cultural demands, prob-
ably much anxiety and self-dissatisfaction
may ensue. Because occupational groups are
readily available for study, they have been
selected here as a particular type of sub-culture
for scrutiny. This study proposes only to
ascertain the personality traits that appear to
be possessed by policemen. Many occupa-
tions will have to be investigated to ascertain
the personality attributes associated with mem-
bership in the occupation. But occupational
groups must be carefully selected and the sub-
jects identified with respect to the homogeneity
of their vocational interests. It is interesting
to note that studies in the field of vocational
choice and achievement have been suggested
by Rapaport (9) as a method of validating the
Rorschach Test.

The hypothesis that a significant relation-
ship exists between measured vocational in-
terests and job satisfaction postulated by
Strong (11) requires substantiation. An in-
vestigation recently completed by the writer
(3) did not demonstrate any significant rela-
tionship between measured vocational interests
and job satisfaction of routine office workers.
More evidence is essential with respect to this
hypothesis.

Subjects: The subjects of this study were 
twenty-five New York City Patrolmen who 
volunteered while off duty for testing. All 
came from one precinct. Their average age 
was 32.8 years and their mean educational at- 
tainment was 12.2 years. Due to the small 
sample, the results discussed below must be 
considered as suggestive rather than con- 
clusive. 

Method: The Rorschach Test was adminis- 
tered in groups of four or five and an individual 
inquiry was conducted with each subject. 
The subjects completed the Strong Vocational 
Interest Blank and a job satisfaction blank of 
the Hoppock type. The Rorschach responses 
were scored as suggested by Klopfer (4) and 
were evaluated by means of the Munroe In- 
spection List (6) which yielded a total score 
indicative of the degree of maladjustment. 
The subjects’ responses were appraised ac- 
cording to the extent of their deviation from 
clinically established normal limits in terms
of 28 Rorschach categories.


Results 


The mean policemen interest score for the 
subjects was 40.6 and the standard deviation 
was 10.1. The difference between this mean 
interest score and that for Strong’s criterion 
group was significant at the one per cent level. 
However, when the scores were divided into 
two categories corresponding to A and B+ 
ratings and those B and below in ratings, this 
distribution was not significantly different from 
the expected distribution of scores according
to Strong’s data (11).


Table 1

Product-Moment Intercorrelation Among Police Interests, Job Satisfaction, Occupational Level,
and Munroe Inspection Technique Scores

                     Police      Job           Occupational  Munroe
                     Interests   Satisfaction  Level         Scores
Police Interests     --          .35           -.74**        .19
Job Satisfaction     .35         --            -.51**        .47*
Occupational Level   -.74**      -.51**        --            -.26
Munroe Score         .19         .47*          -.26          --

* Significant at five per cent level.
** Significant at one per cent level.


The mean job satisfaction score of the police-
men was 20.0, with a standard deviation of 3.2,
while the mean Munroe Inspection score was
11.9, with a standard deviation of 3.7. The
mean occupational level score on the Strong
blank for the policemen was 48.9, standard
deviation 5.7, which mean was not significantly
different from that of Strong's criterion group.

Table 1 lists the product-moment correlation 
coefficients between the variables under con- 
sideration, namely, job satisfaction, occupa- 
tional level, measured police interests, and 
Munroe Inspection scores. 
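
One common way to judge whether a product-moment coefficient from 25 subjects differs from zero is the t test with N - 2 degrees of freedom. The paper does not state which test was used, so the sketch below is an assumption offered only as a check on the asterisks in Table 1; the function names are hypothetical, and the two-tailed critical values for 23 degrees of freedom (2.069 and 2.807) are taken from standard t tables.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a product-moment r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Two-tailed critical values of t for 23 degrees of freedom (n = 25).
CRITICAL_T_5_PER_CENT = 2.069
CRITICAL_T_1_PER_CENT = 2.807


def significance_mark(r, n=25):
    """Return '**', '*', or '' following the table's footnote convention.

    The critical values above apply only to n = 25, the sample size here.
    """
    if n != 25:
        raise ValueError("critical values are tabulated for n = 25 only")
    t = abs(t_for_r(r, n))
    if t >= CRITICAL_T_1_PER_CENT:
        return "**"
    if t >= CRITICAL_T_5_PER_CENT:
        return "*"
    return ""


# The six distinct coefficients reported in Table 1.
marks = {r: significance_mark(r) for r in (0.35, -0.74, 0.19, -0.51, 0.47, -0.26)}
```

Under this assumption the marks agree with the table: -.74 and -.51 reach the one per cent level, .47 reaches the five per cent level, and .35, .19 and -.26 do not reach significance.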

The biserial correlation coefficients between
some of the Rorschach response categories on
the Munroe Inspection List and job satisfac-
tion, police interests, and occupational level
are listed in Table 2.


Table 2

Biserial Correlation Coefficients Between Munroe
Inspection List Categories and Police
Interests, Job Satisfaction, and
Occupational Level

Rorschach         Police      Job           Occupational
Score             Interests   Satisfaction  Level
Color:Movement    ?           .03           -.47*
CF:FC             -.25        -.02          ?
FC                -.31        -.03          .61*
?:M               ?           .00           .00
?                 .61*        .00           -.52*
Total Color       .55*        .52*          ?
m                 -.48        .00           ?
o                 .25         .20           -.30
a                 -.07        -.51*         .29
d                 ?           ?             ?
W:M               ?           .29           ?
W:M (M>W)         ?           ?             ?

* Significant biserial correlation coefficients.


In Table 3, the respective quantities of the
whole (W) and human movement (M) responses
are shown that have been arbitrarily considered
as the optimum ratios. If the number of whole
(W) responses was smaller than its lower limits
for the specific number of human movement
(M) responses given in the protocol, the ratio
was considered to be excessively in favor of the
human movement (M) responses. If the num-
ber of whole (W) responses was greater than
its upper limits in relation to the human move-
ment (M) responses then the ratio was con-
sidered to demonstrate an excessive prepon-
derance of whole (W) responses.


Discussion 

The mean score of the subjects was significantly 
lower than that of Strong's criterion 
group, indicating that many subjects possessed 

Table 3 

Arbitrary Optimum Numbers of Whole (W) Responses 
in Relation to Human Movement (M) Responses 

Numbers of Human        Arbitrary Optimum 
Movement (M)            Numbers of Whole 
Responses               (W) Responses 

1                       2 to 4 
2                       3 to 8 
3                       5 to 12 
4                       6 to 15 
5                       [illegible] to 15 
6                       9 to 15 
7                       10 to 16 
8                       11 to 18 

Rorschach Responses, Strong Blank Scales, and Job Satisfaction 


extremely low measured police interests. On 
the other hand, the fact that the proportion 
of A and B+ interest ratings was as expected 
in terms of Strong's criterion group illustrated 
that letter ratings were probably of greater 
accuracy than the magnitude of interest scores 
in forecasting the composition of individuals 
within an occupational group. Furthermore, 
letter ratings should be given more value than 
the magnitude of standard scores in guiding 
individuals. Finally, it may be concluded 
that many subjects entered police work be- 
cause of their genuine interest while the 
others had different reasons. 

The degree of job satisfaction was signifi- 
cantly greater than that of routine office clerks 
(3), significantly lower than that of nursing 
students (7), and slightly but insignificantly 
lower than that of engineering students (1). 
There was probably greater opportunity for 
policemen to express their abilities and predilections 
in the pursuit of their duties than 
existed for routine clerks. The factor of 
greater remuneration of the policemen was an 
important element that could not be evaluated. 

The degree of maladjustment of the policemen, 
as measured by the Rorschach Test, was 
slightly but insignificantly smaller than that of 
routine clerks (3) and slightly but insignificantly 
greater than that of biologists (10). 
Probably, the policemen, as a whole, demonstrated 
as many signs of maladjustment as may 
be found in other groups. 

No significant relationship was found to 
exist between the degree of measured police 
interests and satisfaction with police work. 
Consequently, the police interest scale of the 
Strong Blank cannot be used with any degree 
of certainty in forecasting satisfaction with 
police work. One of the reasons for this absence 
of relationship may be assigned to the 
lack of precision of the measuring instruments. 
Moreover, Strong, in setting up his occupational 
scales, did not attempt to differentiate 
the satisfied from the dissatisfied members of 
an occupation. It was unwarranted to assume 
that those with the higher interest scores 
would be more satisfied than those with the 
lower scores. Similarly, since there was no 
significant relationship between job satisfaction 
and police interest ratings when the latter were 
separated into A and B+, and B and lower 
ratings, letter ratings may not be utilized as a 
predictor of work satisfaction. 

Another reason for this absence of a signifi- 
cant relationship between job satisfaction and 
measured interests was the fact, as Strong 
indicated, that for some occupations, the weed- 
ing out process is not thoroughgoing and the 
elimination of candidates is not in terms of 
their personal attributes. Hence, the members 
of such occupations may not be homogeneous. 
What would satisfy one need not satisfy an- 
other member of the occupation. The more 
homogeneous the members, the more will 
similar occupational pursuits satisfy the mem- 
bers. The greater the homogeneity of mem- 
bers of an occupation, the greater and more 
significant will be the relationship between 
measured interests and job satisfaction. The 
less homogeneous the members, the smaller 
will be the relationship between measured 
interests and job satisfaction. Policemen may 
not be as homogeneous as, for example, physicians 
and engineers, and hence the relationship 
between their measured interests and job 
satisfaction may not be as great as that found 
for physicians and engineers. Furthermore, 
in vocations where personal skills are 
required and the tasks are challenging, it is 
probable that the correspondence of measured 
interests to those of other successful members 
of the occupation would be significantly associ- 
ated with work satisfaction. [A more extensive 
treatment of this hypothesis can be found in 
the monograph by Kates (3).] 

Finally, the unlikely possibility must be 
kept in mind that measured interests may 
change with experience on the job. Probably, 
the stability of measured interests goes hand in 
hand with the homogeneity of members of an 
occupation by being related as well to the 
extent and character of the elimination process. 
Where the weeding out process is extensive and 
is based on personal attributes, the measured 
interests of the members of the occupation 
should be quite stable. Where the elimination 
process is not so extensive and is not based on 
personal attributes, measured interests may be 
subject to change by further experience on 
the job. 

The significant negative relationship between 
the occupational level and police interests is 
quite similar to that found by Strong. For 


Solis L. Kates 
those policemen with high measured interests, 
the occupational level, probably indicative of 
vocational aspiration level (2), was low. 

The significant negative relationship between 
job satisfaction and occupational level lent 
further support to the acceptance of the occu- 
pational level scale as an indicator of the level 
of aspiration in vocational endeavor. The 
policemen who had high occupational levels of 
aspiration were prone to be dissatisfied with 
their jobs. It illustrates the fact that in police 
work, the occupational level is a more potent 
contributor to job satisfaction and dissatis- 
faction than measured police interests. Prob- 
ably, it may be stated that, for policemen, job 
satisfaction is not as intimately linked with 
measured police interests as it is related to 
their occupational levels of aspiration. The 
subjective reactions of the policemen in terms 
of their aspiration levels, as measured by the 
occupational level scores, were of greater im- 
portance to their job satisfaction than the mere 
possession of high police interests. The fact 
that the member may be similar in his meas- 
ured interests to other successful members of 
his occupation may not be as important to his 
satisfaction as the fact that he has not attained 
to a level of vocational achievement that he 
believes necessary for his success. This con- 
clusion is consistent with the experiments of 
Lewin (5) demonstrating that the level of as- 
piration is a more vital factor for satisfaction 
with achievement than is objective perform- 
ance. Probably in the higher status occupa- 
tions such as physician and lawyer, there 
would be a positive relationship between occu- 
pational level and job satisfaction. 

The more maladjusted policemen, as measured 
by the Rorschach Test and Munroe Inspection 
Technique (assuming that the Munroe 
is a valid criterion of maladjustment for N.Y.C. 
policemen), tended to be more satisfied with 
their jobs than the less maladjusted. Hence, 
[illegible] the hypothesis that 
[illegible] maladjustments might contribute 
to his job satisfaction rather than to 
his job dissatisfaction. It controverted the 
[illegible] that job dissatisfaction stemmed 
from [illegible] maladjustment. 

It was ascertained that policemen with high 
police interests could be or could not 
be maladjusted. Similarly, high occupational 




levels of aspiration were not associated with 
maladjustment. In police work, it may not 
be concluded that the individuals with the 
greater degree of vocational ambition tended 
to be more maladjusted. 

High police interest scores were associated 
with the tendency of policemen to give signifi- 
cantly more movement (M, FM, m) responses 
than color responses (FC, CF, C) to the 
Rorschach cards. They were prone primarily 
to be motivated by and responsive to inner 
promptings rather than to stimuli from without. 

High police interests were related to the 
tendency for policemen to give an adequate 
number of human movement (M) responses of 
reasonably accurate form. Low police inter- 
ests were associated with the giving by police- 
men of poor human movement (M) responses 
or two and less human movement (M) re- 
sponses. It may be said that the policemen 
who were prone to accept their inner prompt- 
ings as constructive and positive forces would 
have high police interests. 

The higher the police interests, the greater 
was the tendency for the subjects to give two 
and less color (FC, CF, C) responses. Police- 
men with high police interests did not possess 
the general readiness to establish emotional 
relationships with the world around them. 

In addition, the higher the job satisfaction 
score, the greater the tendency of policemen to 
give two and less color (FC, CF, C) responses. 
Satisfied policemen were not prone to establish 
a ready emotional relationship with the world 
around them. It is probable that policemen 
who tend to establish good social relations 
would not be happy on their jobs. Those 
who as a matter of practice did not establish 
good social relations would probably find the 
conditions of work rewarding. 

High job satisfaction was related to the 
giving by policemen of three and less [illegible] 
responses. The more satisfied policemen were 
not likely to think along conventional lines. 

No other significant relationships were discovered 
between the Munroe Inspection List 
categories and police interests and job satisfaction. 

The higher the occupational level of aspiration, 
the greater the tendency for policemen to 
give total color (FC, CF, C) responses that 
were equal to or greater than total movement 
(M, FM, m) responses. Higher occupational 
levels of aspirations were associated with the 
policemen's responsiveness to the environment 
about them. The same trend was evident in 
the finding that high occupational levels were 
related to the giving of an adequate num- 
ber of form color (FC) responses. Rational 
emotional reactions to stimuli by policemen 
were associated with high occupational levels 
of aspiration. Finally, high occupational levels 
of aspiration were given by policemen who had 
an adequate number of color (FC, CF, C) responses, 
demonstrating further that the ability 
to be responsive to environmental stimuli went 
hand in hand with high vocational aspiration. 
It does sound reasonable and valid to have 
policemen with high occupational levels of 
aspiration display characteristics that are recognized 
as necessary to achieve higher economic 
status. In another study, a definite relationship 
has been demonstrated to exist between 
great energy and activity in out-of-school and 
co-curricular activities of high school boys, and 
high occupational level scores (8). Since responsiveness 
to outside stimuli is self-evident 
as a trait for carrying on co-curricular and 
out-of-school activities, one of the results of this 
study is to furnish additional evidence validating 
the meaning of the Rorschach color 
responses. 
Low occupational levels of aspiration were 
associated with adequate human movement 
(M) responses of policemen. High occupational 
levels were related to less adequate 
numbers of human movement (M) responses 
and poorly seen human movement (M) responses. 
The more nearly the ratio, whole 
(W) responses to human movement (M) responses, 
approached an arbitrary optimum, the 
lower the occupational level of policemen. 
When the number of human movement (M) 
responses in relation to whole (W) responses 
was in excess of the arbitrary optimum, the 
occupational level of the policemen tended to 
be high. 
It may be concluded that when policemen 
had an adequate number of human movement 
responses in terms of both the Rorschach 
protocol and whole (W) responses, they were 
inclined to accept their inner promptings and 
their present status in life. They were not 
prone to aspire to higher occupational levels as 
they had probably accepted as suitable their 
own outlook on life as well as the realization 
that their attainments were consonant with 
their ability. Policemen with inadequate 
human movement (M) responses tended not to 
accept their status in life, and appeared to have 
a need to aspire to higher occupational levels. 
The same conclusion applied to a preponder- 
ance of human movement (M) responses in 
relation to whole (W) responses with the prob- 
able additional meaning that the excessive 
inner strivings within the policemen drove 
them toward higher occupational levels. 


Summary 


The subjects of this study, New York City 
policemen, were found: (a) to demonstrate no 
significant difference from Strong’s criterion 
group in measured police interest ratings, (b) 
to have a relatively high job satisfaction level, 
and (c) to be no more maladjusted than routine 
office clerks and biologists. 

No relationship was discovered between the 
subjects’ measured interests and job satis- 
faction, between measured interests and Ror- 
schach maladjustment, and between occupa- 
tional level and Rorschach maladjustment. 
Significant relationships were found to exist 
between measured police interests and occupa- 
tional level, between job satisfaction and 
occupational level, and between job satis- 
faction and Rorschach maladjustment. In 
these conclusions, the assumption was made 
that the Munroe Inspection Technique is a 
valid criterion of maladjustment for New York 
City policemen. Several reasons were ad- 
vanced for the presence and absence of the 
above relationships. 

Additional support has been received from 
the results of this study to validate the meaning 
of the color (FC, CF, C) responses. The 
human movement (M) responses have been 
assessed in a different context and additional 
meanings have been attached to the relations 
of the human movement (M) response to the 
whole (W) response and to the total Rorschach 
protocol. 

Investigations of policemen who are under- 
going treatment as a consequence of emotional 
maladjustment should be of help in ascertain- 
ing whether the hypothesis obtains, namely 




that the personality traits of the particular sub- 
cultural group with which the patient was 
identified, should be known to evaluate his 
relative degree of maladjustment. 

Finally, policemen with high police interests 
tended to be markedly introversive, to have 
adequate ability to accept their own strivings 
and outlook as mature, and to be relatively 
unresponsive to stimuli from without. Satis- 
fied policemen tended to be relatively unre- 
sponsive to stimuli from without and to be 
lacking in the capacity to think along conventional 
lines. Further studies of the personality 
of policemen, if they confirm the above results, 
will be instrumental in helping to validate the 
significance of some Rorschach responses. 


Received October 17, 1949. 


References 


1. Berdie, R. F. The prediction of college achievement 
and satisfaction. J. appl. Psychol., 1941, 
25, 197-204. 

2. Darley, J. G. Clinical aspects and interpretation of 
the Strong Vocational Interest Blank. New York: 
The Psychological Corporation, 1941. 

3. Kates, S. L. Rorschach responses related to vocational 
interests and job satisfaction. Psychol. 
Monogr., 1950, 64, 3 (Whole No. 309). 

4. Klopfer, B., and Kelly, D. The Rorschach technique. 
New York: World Book Co., 1942. 

5. Lewin, K., Dembo, T., Festinger, L., and Sears, 
P. S. Level of aspiration. In J. McV. Hunt 
(Ed.) Personality and the behavior disorders. 
New York: Ronald Press, 1944, Vol. 1, 333-378. 

6. Munroe, Ruth L. The inspection technique: a 
method of rapid evaluation of the Rorschach 
protocol. Rorschach Res. Exch., 1944, 8, [pages illegible]. 

7. Nahm, Helen. Satisfaction with nursing. J. appl. 
Psychol., 1948, 32, 335-343. 

8. Ostrom, S. R. The OL key of the Strong Test 
and drive at the twelfth grade level. J. appl. 
Psychol., 1949, 33, 240-248. 

9. Rapaport, D., Gill, M., and Schafer, R. Diagnostic 
psychological testing. Chicago: Year Book Publishers, 
Inc., 1946. Vol. II. 

10. Roe, Anne. Analysis of group Rorschachs of biologists. 
Rorschach Res. Exch., 1949, 13, 25-[illegible]. 

11. Strong, E. K., Jr. Vocational interests of men and 
women. Stanford, California: Stanford U. 
Press, 1943. 


Card Versus Booklet Forms of the MMPI 


Wm. C. Cottle 


The University of Kansas 


The purpose of this article is to compare the 
card and booklet forms of the Minnesota 
Multiphasic Personality Inventory (MMPI) 
to determine whether it would be possible to 
use these forms interchangeably in a testing 
program. Is it possible to consider them as 
comparable forms of the test? 

Wiener (1), alternating the card and booklet 
form in chronological order with each new 
counselee at the St. Paul Guidance Center, 
found that “there are no differences between 
the individual and group forms on any scales 
that approach statistical significance." 

Holzberg and Allessi (4), using 30 psychiatric 
patients tested within one or two days on the 
full and abbreviated form of the MMPI, report 
that “although statistically significant differences 
were found between the mean weighted 
scores of half the scales, these results were not 
clinically significant as judged by profile 
results.” 


The Sample 


A brief description of the sample of 100 cases 
included here is given in Table 1. It will be 
noted that the educational level of the group is 
relatively high since all are college students, 
with approximately half the group undergraduates 
and the rest graduate students in the 
School of Education at the University of 
Kansas. Modal school grade attained by this 
group was first year of graduate study. 

The group volunteered to take both forms 
of the test and no pressure was exerted to 
secure compliance.¹ As evidence of this only 
100 of the proposed group of 175 completed 
both forms of the test within the one week 
time-limit imposed. 

Age of the group ranged from 18 to 60 with 
a median age for the 68 males of 27.8 years 
and for the 32 females of 21.5 years. 


¹ Acknowledgment is made to Professors E. T. Gaston, 
[illegible] Turney, J. F. Nickerson, G. M. Carney, and 
[illegible] Powell of the University of Kansas for making it 
possible for their students to take these tests. 


An undergraduate group of music education 
students contained 23 males and 19 females. 
In a subgroup of music education teachers 
taking graduate work there were 15 males and 
four females. In a subgroup of miscellaneous 
teachers and public school administrators there 
were 30 males and nine females. These com- 
prise the total group of 68 males and 32 females. 


Table 1 
The Sample 

A. Age                  Male    Female 
   18-20                   7        13 
   21-25                  17        10 
   26-30                  18         2 
   31-35                   8         0 
   36-40                  10         3 
   41-45                   3         3 
   46-50                   4         0 
   51-55                   1         0 
   56-60                   0         1 
   Total                  68        32 

B. School Grade         Male    Female 
   13                      9        10 
   14                      1         0 
   15                     13         9 
   17                     36        13 
   18                      7         0 
   19                      2         0 
   Total                  68        32 

C. Occupation           Male    Female 
   Mus. Ed. Student       23        19 
   Mus. Tchr.             15         4 
   Misc. Tchr.            30         9 
   Total                  68        32 

D. Card Form Taken      Male    Female 
   First                  34        20 
   Last                   34        12 
   Total                  68        32 


Table 2 
Means and Standard Deviations of Card vs. Booklet Form of MMPI for 100 College Students 
~ 3 Ma 
1 k K F Hs D Hy Pd Mf Pa Pt Sc = 
Scales 2 IE = 
7 491 788 891 7. 5 
X 53 1537 272 388 1684 1888 1334 24 8 a 
93 : pe 499 234 357 3.94 515 432 540 2.72 699 646 : 
Ar o l » 
3 82 15. 
8 Males X 2.64 15.81 3.56 3.99 18.54 19.71 14.19 26.00 8.84 10.18 : e 
i Booklet v 125 449 278 362 46 5.00 441 562 2372 78 š a 
46 14.03 
32 Females 2.56 1544 244 525 17.87 2144 1313 3609 822 na E D 46 
Card e 122 $06 209 478 116 519 431 541 253 743 2 n 
r 5 9.00 15. 
32Females X 2.91 15.97 247 5.13 1863 21.63 1378 3747 872 1109 eed 238 
Booklet o 131 494 225 5.00 5.06 566 475 444 268 8.16 E ea 
7 7 37 15. 
100 Cases X 254 15.53 2.68 432 1717 19.70 1333 2849 799 971 : x s 
Card c 155 5001 227 406 4 530 432 750 267 7.13 8 ea 
og 7 got 15: 
100 Cases X 2.73 15.86 321 4.35 1857 2032 14.06 29.67 8.80 10.47 : 50 470 
Booklet c 1.28 464 268 415 48 5.0 453 751 271 794 : 
Procedure 


It was intended that one-half of the males 
and females would take the card form first 
and the booklet form second within one week. 
The other half would take the booklet form 
first and the card form second within a week. 
The booklet form was administered in a group 
situation and the card form was given individually. 
In actuality, as shown in Part D of 
Table 1, the planned procedure was achieved 
with the male group, but not with the female 
group. Twenty females took the card form 
first and only 12 females took the card form 
last. 


Raw scores without the “K” correction were 
recorded for each individual on each form. 
Pearson product moment correlation coefficients 
were computed for males, females, and 
the entire group on each scale. These are 
shown in Table 3. 

Means and standard deviations for the entire 
group and separate groups of 68 males and 
32 females are shown in Table 2. The profile 
of mean scores for males is shown in Figure 1 
and that for females is shown in Figure 2. 


Fig. 1. Mean T-scores on the MMPI for 68 college males. 


In order to examine the magnitude of the 
standard error of estimate where card and 
booklet forms are assumed to be equivalent 


and to check the accuracy of the product 
moment correlations, a table of score devia- 
tions was constructed and correlation coeffi- 
cients were computed by means of the ratio of 
the estimated true variance to the observed 
variance. In these computations variance 
error was computed as one-half the variance 
of the distribution of difference in raw score 
between the card and booklet form and observed 
variance was taken to be the average of 
the variance of the card and booklet form.² 
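The computation described in this paragraph can be sketched as follows. This is a minimal illustration, not the author's worksheet: the function name and the toy scores are hypothetical, and population variances (divisor N) are assumed because the text does not say which divisor was used.

```python
from math import sqrt
from statistics import pvariance


def variance_ratio_estimates(card, booklet):
    """Return (r_prime, se_estimate) for paired raw scores on two forms.

    Error variance is taken as one-half the variance of the card-minus-booklet
    differences, and observed variance as the average of the two form
    variances, as described in the text.
    """
    if len(card) != len(booklet):
        raise ValueError("card and booklet scores must be paired")
    differences = [b - c for c, b in zip(card, booklet)]
    v_d = pvariance(differences)                      # variance of differences
    v_e = v_d / 2                                     # error variance
    v_o = (pvariance(card) + pvariance(booklet)) / 2  # observed variance
    r_prime = 1 - v_e / v_o                           # true / observed variance
    se_estimate = sqrt(v_e)                           # in raw score points
    return r_prime, se_estimate


if __name__ == "__main__":
    # Hypothetical raw scores for five examinees, for illustration only.
    card_scores = [10, 12, 15, 18, 20]
    booklet_scores = [11, 12, 17, 18, 22]
    r_prime, se = variance_ratio_estimates(card_scores, booklet_scores)
    print(f"r' = {r_prime:.3f}, standard error of estimate = {se:.3f}")
```

Algebraically, this coefficient equals the covariance of the two forms divided by their average variance, so it coincides with the product moment coefficient whenever the two forms have equal variances. That is why it can serve as a check on the product moment values.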

Table 3 shows for each scale the correlation 
coefficient secured in this manner (r'), the 
variance of the total distribution of differences 
in raw score between the two forms for each 
scale (Vd), the average of the variance for both 
forms (Vo), the standard error of estimate in 
raw score points, the approximate T-score 
change at one standard error, and the variance 
ratios of card and booklet forms. 


² The formulae used for these computations were: 

$$V_d = i^2\left[\frac{\sum f_x x^2}{N} - \left(\frac{\sum f_x x}{N}\right)^2\right]$$

where $V_d$ is variance of difference between raw scores 
on card and booklet; $i$ is an interval of 1 raw score point; 
$N = 100$; and $f_x$ is frequency of deviations per step 
interval, and 

$$r' = 1 - \frac{V_e}{V_o}$$

where $r'$ is the correlation between card and booklet 
form; $V_e = V_d/2$, and $V_o$ is average variance of card 
and booklet form. 


Fig. 2. Mean T-scores on the MMPI for 32 college females. 


Results 


With the exception of the coefficients shown 
for the lie scale (L), depression (D), and 
paranoia (Pa), the product moment correlation 
coefficients reported in Table 3 range from .72 
to .91. Omitting the validity scales, those 
representing actual syndromes of maladjust- 
ment approach the size of accepted reliability 
coefficients. The most reliable scales would 
appear to be masculinity-femininity (Mf), 
psychasthenia (Pt), and schizophrenia (Sc); 
all three above .85. 

Holzberg and Allessi (4) report test-retest 
coefficients on the short and long form of the 
MMPI taken within three days which range 
from .519 to .927. With the exception of hy- 
pochondriasis (Hs), psychopathic deviate (Pd), 
and hypomania (Ma), these coefficients ranged 
between .72 and .93. The scales above .85 
were the lie scale (L), validity (F), hysteria 
(Hy), and schizophrenia (Sc). 

Rotter, in commenting upon the test-retest 
reliability of the card form of the MMPI, reports 
coefficients ranging from .71 to .83 (2). Hathaway, 
in personal correspondence with this 
writer indicates that test-retest coefficients 
ranging from .61 to .85 have been secured on 
seven scales using individual versus group form. 

3 Reference to Table 3 shows that the separate coeffi- 
cients for males and females on the Mf scale are both 
lower than the coefficient for the entire group on this 


scale. Combining these groups increases the range and 
consequently the size of the correlation coefficient for 


the total. 
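The effect described in this footnote, a pooled coefficient exceeding both within-group coefficients once two groups with different means are combined, can be shown with a small sketch. The scores below are invented for illustration and are not taken from Table 2 or Table 3; the helper name `pearson_r` is likewise hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product moment correlation for two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Invented scores: the second group scores about 14 points higher on both
# forms, which widens the range of the combined distribution.
group_a_card, group_a_booklet = [20, 22, 24, 26], [22, 20, 26, 24]
group_b_card, group_b_booklet = [34, 36, 38, 40], [36, 34, 40, 38]

r_group_a = pearson_r(group_a_card, group_a_booklet)
r_group_b = pearson_r(group_b_card, group_b_booklet)
r_pooled = pearson_r(group_a_card + group_b_card,
                     group_a_booklet + group_b_booklet)

if __name__ == "__main__":
    print(f"group A: {r_group_a:.2f}")
    print(f"group B: {r_group_b:.2f}")
    print(f"pooled:  {r_pooled:.2f}")
```

The within-group agreement is identical in both groups; only the added between-group spread raises the pooled value.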




Hathaway and McKinley (3) report reliability 
coefficients for the card form as follows: 


[illegible] with 40 normal cases, r = .77 ± .044 
[illegible] with 47 normal cases, r = .74 [illegible] (test-retest) 
Pt with 200 normal cases, r = .91 ± .07 (corr. split half) 
Hy with 47 normal cases, r = .57 (test-retest) 
Ma, number not given, r = .83 (test-retest) 
Pd with 47 normal cases, r = .71 (test-retest) 


Capwell (5) reports test-retest correlations 
on 85 public school girls ranging from .40 to .77, 
only three of which are above .72. She reports 
similar correlations on 98 home school girls 
which range from .33 to .71. The difficulty 
in comparing those studies to the present study 
is that only the one by Holzberg and Allessi 
employs a similar short time between tests. 
That is, it is difficult to estimate how much the 
reliability coefficient is affected by personality 
changes occurring in the length of time between 
tests. 

It would appear that, with the exceptions 
noted above for L, D, and Pa, correlation 
coefficients between card and booklet forms of 
the MMPI reported here in Table 3 are as high 
as those reliability coefficients reported for the 
card form alone. This would suggest that the 
group or booklet form of the MMPI could be 
used interchangeably with the card form for a 
college group. The saving in time required to administer 
and score the booklet form makes it 
preferable in many psychometric situations. 

Wiener (1) reports that on seven of the 
scales there is a slightly higher average score on 
the individual form than on the group form. 
Reference to Table 2 indicates that males 
score slightly higher on the booklet (group) 
form for all scales, except Ma, where mean 
scores are identical. 

Mean scores for females are slightly higher 
for the booklet form also, except for Hs, Pt, 
and Sc. The combined mean scores for the 
100 cases (P<.01) are all slightly higher for 
the booklet form. It would appear that there 
is a tendency for a college population to score 
slightly higher on the booklet than on the card 
form of the MMPI. A college population 
tends to place fewer items in the “Cannot Say” 
category when the booklet form is used. 
These results do not necessarily conflict with 
the research of Wiener, because his group was 
a non-disabled male veteran population at a 
guidance center operated by the Department 
of Education at St. Paul, Minnesota. 

Let us consider the results shown in Table 3. 
It is evident there is no significant difference 
in the result of the two methods of computing 
the correlation coefficients between card and 
booklet forms. [illegible] 
of the standard error of the estimate, e.g., one-half 
the variance of the raw score differences 
between card and booklet forms on the Sc 
scale is 7.76 and the standard error of 
estimate is calculated as 2.78. This means 

Table 3 

Correlation Coefficients and Related Estimates for Card vs. Booklet Form 
of the MMPI with 100 College Students 

NE E È Gs Hy Pd Mt Pa PL 5 7 
res Males 68 31 39 3; 3j gem E 5 $0 
ra, Females 32 34 7 7 ř f E a Dk - e 82 
72/02. 80 60 93 39 6 8 
rey (Product i s 16 
, omn Wo dà qe sg a g 2 80 oa 56 90 8% 76 
coe Ee eee au d ‘or oc o 088 95 3&0 
d vu% ans ne Ead 1.61 326 692 777 406 519 316 6.64 ip 5? 
o=Ve b .05 57 2 s 3 eR g 7.57 529 
"wi euy ! 623 1705 2001 2838 1985 5697 734 575 D 
Points) 107 225 12; . 7 OL 
Approx. T-score U^ PH A 278 20 227 agio L5] 5 
change at 1 S.E. 3 4 3 4 ; : " 4 4 mi 
F= VarianceRatio — 145. 14; 139 105 142 m Ld Vas ins um i5 D 




that in two out of three administrations, the 
raw score earned on one form will be within 
±2.78 of the score that would be secured had 
the alternate form been given. This in sum 
is the meaning of the correlation coefficient 
(r') of .85 derived for the Sc scale. 

The next to last row of Table 3 indicates the 
approximate change in T-score points for each 
shift of one standard error. Using the Sc scale 
again as an example, a change in raw score 
points of this magnitude results in a change 
of approximately four-tenths of a standard 
deviation. Such a change could be significant. 
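The Sc figures quoted in the text can be checked with a few lines of arithmetic. The sketch below uses only the two reported values (one-half the variance of differences, 7.76, and the coefficient of .85). It assumes that the observed variance can be recovered from the variance-ratio formula and that one standard deviation corresponds to ten T-score points; the variable names are illustrative.

```python
from math import sqrt

# Values reported in the text for the Sc scale.
error_variance = 7.76   # one-half the variance of card-booklet differences
r_prime = 0.85          # correlation obtained from the variance ratio

# Standard error of estimate in raw score points.
se_estimate = sqrt(error_variance)

# Observed variance implied by r' = 1 - Ve / Vo (an assumption of this sketch).
observed_variance = error_variance / (1 - r_prime)
observed_sd = sqrt(observed_variance)

# One standard error expressed as a fraction of a standard deviation, and in
# T-score points (assuming 10 T-score points per standard deviation).
se_in_sd_units = se_estimate / observed_sd
t_score_change = 10 * se_in_sd_units

if __name__ == "__main__":
    print(f"standard error of estimate: {se_estimate:.2f} raw score points")
    print(f"fraction of a standard deviation: {se_in_sd_units:.2f}")
    print(f"approximate T-score change: {t_score_change:.1f}")
```

Under these assumptions one standard error works out to roughly 2.8 raw score points and about four-tenths of a standard deviation, in line with the values stated in the text.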
Profiles of means shown in Figures 1 and 2 
indicate small T-score differences between card 
and booklet form. They could be considered 
normal profiles for a college group. 


Summary 


Correlation coefficients, means, and the 
standard deviations were computed for 68 
male and 32 female college undergraduate and 
graduate students to determine the equivalence 
of the card and booklet forms of the MMPI. 
With the exception of L, D, and Pa, these 
coefficients range between .72 and .91. This 
is as high as the majority of the reliability 
coefficients reported for the card form alone. 

Mean scores indicated in Table 2 suggest 
that a slightly more elevated profile would be 
secured on a college population by use of the 
booklet form. Fewer items are left unanswered 
on the booklet form. 

Inclusion of the standard error of estimate in 
raw score points and the approximate T-score 
change at one standard error in Table 3 permits 
one to judge whether he wishes to use these 
inventories as alternate forms. If these coeffi- 
cients are interpreted as reliability indices, the 
standard error of estimate is as shown in Table 
3. If the negatively skewed distribution of 
errors is recalled, that is, the tendency to score 
slightly higher on the booklet form, one is 
forced to adjust individual diagnosis upward 
from card to booklet form or resign himself to 
an inequivalence of forms. 


Received October 21, 1949. 


References 


1. Wiener, D. N. Differences between the individual 
and group forms of the Minnesota Multiphasic 
Personality Inventory. J. consult. Psychol., 
1947, 11, 104-106. 

2. Buros, O. K. The third mental measurements yearbook. 
New Brunswick: Rutgers University 
Press, 1949. Comments of Julian B. Rotter on 
the Minnesota Multiphasic Personality Inventory 
on p. 60. 

3. Hathaway, S. R., and McKinley, J. C. A Multiphasic 
Personality Schedule (Minnesota): I. 
Construction of the schedule. J. Psychol., 1940, 
10, 249-254; II. A differential study of hypochondriasis. 
J. Psychol., 1940, 10, 255-268; 
III. The measurement of symptomatic depression. 
J. Psychol., 1942, 14, 73-84; IV. Psychasthenia. 
J. appl. Psychol., 1942, 26, 614-624; 
and V. Hysteria, hypomania, and psychopathic 
deviate. J. appl. Psychol., 1944, 28, 153-174. 

4. Holzberg, J. D., and Allessi, S. Reliability of shortened 
MMPI. J. consult. Psychol., 1949, 13, 
288-292. 

5. Capwell, D. F. Personality patterns of adolescent 
girls. I. Girls who show improvement in I.Q. 
J. appl. Psychol., 1945, 29, 212-228; and II. 
Delinquents and non-delinquents. J. appl. 
Psychol., 1945, 29, 289-297. 


A Factor Analysis of MMPI and Aptitude Test Data 


Lt. Comdr. Ellsworth B. Cook (MSC) U.S.N. 
Tufts Medical School, Boston, Mass. 


and 


Robert J. Wherry 
Department of Psychology, Ohio State University 


This paper discusses the results of the ad- 
ministration of psychometric and psychomotor 
tests to a group of 120 naval enlisted submarine 
candidates. The study comprised one phase 
of an investigation of the possible value of a 
wide variety of measures for the selection of 
submarine personnel (1). 


Procedure 


Subjects were randomly selected (1) and two 
groups of six subjects each were tested weekly. 
The tests employed were: 


I. The Minnesota Multiphasic Personality 
Inventory: Hereinafter referred to as the 
MMPI, this test is designed to provide scores 
on all the more important phases of personality 
(2, 3, 4, 5, 6) and has been used extensively for 
the overall differentiation of normals from 
abnormals or persons predisposed to abnormal 
developments (7, 8, 9, 10, 11, 12). The short 
group form was used. 

II. Two-Hand Coordination Test: This is a 
motor pursuit task which has been employed 
frequently in the selection of military personnel 
(14, 15, 16, 17, 18, 19). The essential psy- 
chological principle involves the carrying out 
of two coordinated movements simultaneously 
so that there is a conflict of attention. The 
subject is rated on his ability to manipulate 
hand cranks in such a way as to keep a small 
button in continuous contact with an irregu- 
larly moving disc. An electrically operated 
stop clock measures the total amount of time 
during which actual contact is maintained. 
Two 1-minute trials were used. 

* The study reported herein was conducted at the 
U.S.N. Medical Research Laboratory, U.S. Submarine 
Base, New London, Connecticut, under Research 
Project NM-003-017. Opinions expressed are those of 
the authors and are not to be construed as necessarily 
reflecting the views or the endorsement of the Navy 
Department. 

III. Basic Battery of Written Tests. This 
battery consisted of a test of arithmetical 
reasoning (fractions, percentages, etc.), 
mechanical and electrical knowledge (picture 
identification tests), mechanical aptitude 
(simple principles of physics: levers, pulleys, 
braces, etc.), and the General Classification 
Test (verbal abilities). These are standard 
Navy tests for enlisted men (20, 21). In order 
to qualify for Submarine School, enlisted 
candidates must have a combined score of 100 
on the GCT and arithmetic tests. 

IV. Navy Enlisted Personal Inventory. This 
consisted of form 2 of the Personal Inventory 
(23), a group test which presents a standardized 
psychiatric interview in pencil and paper form. 
The forced-choice type items which comprise 
the inventory are based on case [...] 
dissimilarities between psychiatrically [...] 
and normal military personnel (24, 25, 26). 
Inasmuch as individual interviews must 
necessarily be brief during large [...] 
programs, the P.I. serves as a rough device 
to guide the psychiatrist in his interview. 
Scores on the two sections of the inventory 
(personal history and medical history) were 
treated as separate variables. 

V. Tank Performance. Subjects were rated 
on a five-point scale by a submarine officer 
for their overall performance while undergoing 
routine training in the Escape Training Tank. 
This tank, employed to acquaint personnel with 
methods of escaping from a submerged 
submarine, is a tower containing a column of 
fresh water [...] in diameter and 100' deep. 
Hatches or locks located at various points in 
the tower permit an ascent from any desired 
depth. Subjects made two [...] ascents from 
each of the 12', 18' and [...] depths. 
They were rated on such items as 




Table 1 
Means and Standard Deviations of Variables Selected for Analysis 

Var. No.  Source               Variable Description               Mean    S.D. 
01        MMPI                 Lie Score                          54.10   5.91 
02        MMPI                 F (Validity) Score                 52.88   4.68 
03        MMPI                 Hs (Hypochondriasis) Score         46.58   6.83 
04        MMPI                 D (Depression) Score               48.26   7.67 
05        MMPI                 Hy (Hysteria) Score                52.73   6.89 
06        MMPI                 Pd (Psychopathic Deviate) Score    52.40   8.53 
07        MMPI                 Mf (Interest) Score                51.27   8.91 
08        MMPI                 Pa (Paranoia) Score                47.84   7.22 
09        MMPI                 Pt (Psychasthenia) Score           44.78   6.46 
10        MMPI                 Sc (Schizophrenia) Score           46.77   6.29 
11        MMPI                 Ma (Hypomania) Score               58.55   8.42 
12                             Two Hand Coordination (C Score)    11.75   1.88 
13        Navy Basic Battery   General Classification Test        58.88   6.30 
14        Navy Basic Battery   Arithmetical Reasoning             56.92   9.16 
15        Navy Basic Battery   Mechanical Aptitude                57.77   6.77 
16        Navy Basic Battery   Mechanical Knowledge               54.55   8.30 
17        Navy Basic Battery   Electrical Knowledge               54.92   8.23 
18                             Tank Performance Grade              2.89   0.43 
19        Navy P.I.            Personal History                    1.18   1.46 
20        Navy P.I.            Medical History                     0.08   0.30 


apprehension, quickness of response to instruc- 
tions, errors of position on the line, “freezing” 
on the line, fighting to get out of the water too 
quickly, and so forth. 


Statistical Analysis and Results 


Data on 111 of the 120 subjects were utilized 
for statistical analysis. Three men were 
dropped because application of the standard 
criteria indicated that their MMPI scores were 
invalid. Six others were excluded because 
records on them were incomplete. The 20 
variables selected for analysis are listed in 
Table 1, together with their means and stand- 
ard deviations. 

A comparison of mean scores for the 9 
Personality scales with those obtained by 
roughly similar groups (27, 28) indicated that 
performance on the MMPI was typical of a 
young male adult population. 

The mean C score of 11.8 on two-hand co- 
ordination placed the group almost one stand- 
ard deviation above the mean of the sample on 
which this test was validated for naval use 
(29), while the standard deviation of 1.9 was 
comparable to that of the standardizing group. 


Scores were better than average on all the 
items of the Navy basic battery of written tests. 
This was to be expected since standards for 
the submarine service are higher than for the 
Navy generally (22). Mean scores closely 
approximated those of several hundred ex- 
perienced submariners who were reassigned to 
New London in the summer of 1945 (30). 

Mean scores were well below the established 
cut-offs for both sections of the Navy Personal 
Inventory. 

A tabulation of subject score range on each 
personality scale of the MMPI (Table 2), a 
graphical presentation of the MMPI profiles of 
individuals who scored 70 or above on two or 
more scales compared with the mean of the 
whole group (Figure 1), and a comparison of 
subject performance on two-hand coordination 
with several other submarine populations 
(Table 3) are available on request. Space 
does not permit their inclusion here. 

1 To reduce printing costs, Tables 2 and 3 and Fig- 
ure 1 have been deposited with the American Docu- 
mentation Institute. Order Document 2828 from 
American Documentation Institute, 1719 N Street, 
N.W., Washington 6, D. C., remitting $.50 for micro- 
film (images 1 inch high on standard 35 mm. motion 
picture film) or $.50 for photocopies (6 X 8 inches) 
readable without optical aid. 




The intercorrelations of the 20 variables are 
presented in Table 4. A modified (31) Thurstone 
Group Centroid (32) factor analysis yielded six 
independent factors to explain the 
intercorrelations obtained. The residuals arising 
when one attempts to explain the intercorrelations 
on the basis of the factor loadings are also 
included in Table 4. The factor loadings for the 
variables are given in Table 5. A factor loading 
represents the correlation between a given 
measurement and one of the factors isolated. It 
may be positive or negative depending on the 
nature of the relationship with the particular 
variable involved. The factor loading squared 
gives the proportion of score variance of a given 
measurement which may be explained or predicted 
by the factor in question. A loading of .20 or 
higher in absolute value is regarded as 
significant. The reader is reminded that the 
labelling of factors is a matter of interpretative 
judgment rather than a problem in statistics, and 
that he is free to consider and suggest alternate 
designations. 
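
The two rules just stated (squared loading as variance explained, and the .20 criterion of significance) can be illustrated with a minimal Python sketch. The loadings are those quoted in the text for Factor A and for the lie index on Factor B; the function and variable names are hypothetical and are not part of the original analysis.

```python
# Loadings quoted in the text; names of functions and constants are
# illustrative assumptions, not the authors' procedure.
FACTOR_A_LOADINGS = {
    "F (validity)": 0.64,
    "Hs (hypochondriasis)": 0.79,
    "Pt (psychasthenia)": 0.72,
    "Sc (schizophrenia)": 0.93,
    "D (depression)": 0.28,
    "Pa (paranoia)": 0.25,
}

SIGNIFICANCE_CUTOFF = 0.20  # criterion of significance given in the text


def variance_explained(loading: float) -> float:
    """Proportion of a measure's score variance attributable to one factor."""
    return loading ** 2


def is_significant(loading: float, cutoff: float = SIGNIFICANCE_CUTOFF) -> bool:
    """A loading is treated as significant when its absolute value reaches the cutoff."""
    return abs(loading) >= cutoff


for scale, loading in FACTOR_A_LOADINGS.items():
    share = 100 * variance_explained(loading)
    print(f"{scale:<22} loading {loading:+.2f}  variance explained {share:5.1f}%")

# The sign of a loading does not affect its significance: the lie index
# loads -.26 on Factor B and is treated as significant in the text.
print(is_significant(-0.26))
```

On this reading, the schizophrenia scale, with a loading of .93, has about 86 per cent of its variance associated with Factor A, whereas the paranoia scale (.25) has only about 6 per cent.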

Factor A has high positive loadings on the 
validity (.64), hypochondriasis (.79), psychas- 
thenia (.72) and schizophrenia (.93) scales of 
the MMPI, and lower but still significant 
loadings on the depression (.28), hysteria (.28), 
psychopathic deviate (.33), masculinity-femin- 
inity interest (.33), paranoia (.25), and hy- 
pomania (.41) scales of the MMPI, as well as 
on the personal (.21) and medical (.38) history 
sections of the Navy Personal Inventory. In 
general, then, it has significant projections on 
all items which measure neurotic tendencies, 
and is labelled tendency to personality malad- 
justment. The word “tendency” is employed 
to emphasize that the group was a normal one. 
Factor A appears comparable to the general 
factor “maladjusted tendencies” isolated by 
Cottle in his study of the MMPI and the Bell 
Adjustment Inventory (33). 


Table 5 
Final Factor Loadings 

Factor A: tendency to personality maladjustment 
Factor B: numerical-verbal intelligence 
Factor C: tendency to over-activity 
Factor D: tendency to paranoia 
Factor E: mechanical coordination 
Factor F: tendency to femininity of interest pattern 
(.. = entry not legible; entries shown are those quoted in the text) 

No.  Variable                             A     B     C     D     E     F     h2 
01   MMPI Lie Scale                       ..   -.26   ..    ..    ..    ..    .. 
02   MMPI Validity Scale                  .64   ..    ..    ..    .31   ..    .. 
03   MMPI Hypochondriasis Scale           .79   ..   -.23   ..    ..    ..    .. 
04   MMPI Depression Scale                .28   ..   -.48   ..    ..    ..    .. 
05   MMPI Hysteria Scale                  .28   ..   -.43   ..    ..    ..    .. 
06   MMPI Psychopathic Deviate Scale      .33   ..   -.38   ..    ..    ..    .. 
07   MMPI Interest Scale                  .33   ..    ..    .38  -.15   ..    .. 
08   MMPI Paranoia Scale                  .25   ..    ..    .48   ..    ..    .. 
09   MMPI Psychasthenia Scale             .72   ..    ..    ..    ..    .47   .. 
10   MMPI Schizophrenia Scale             .93   ..    ..    ..    ..    ..    .. 
11   MMPI Hypomania Scale                 .41   ..    .56   ..    ..    ..    .. 
12   Two-Hand Coordination Test           ..    ..    ..    ..    .36   ..    .. 
13   General Classification Test          ..    .76   ..    ..    ..    ..    .. 
14   Arithmetical Reasoning               ..    .71   ..    ..    ..    ..    .. 
15   Mechanical Aptitude                  ..    .24   .34   ..    .43  -.41   .. 
16   Mechanical Knowledge                 ..    ..    .38   ..    .57   ..    .. 
17   Electrical Knowledge                 ..    ..    ..    ..    .68   ..    .. 
18   Tank Performance Grade               ..    ..   -.19   ..    ..    ..    .. 
19   Personal History                     .21   ..    ..    ..    ..    .40   .. 
20   Medical History                      .38   ..    ..    ..    ..    ..    .. 




Factor B has high positive loadings on the 
GCT (.76) and arithmetic (.71) tests and a 
lower positive loading (.24) for mechanical 
aptitude. This factor appears indicative of 
the ability to follow directions, and akin to the 
trait measured by traditional intelligence tests. 
Accordingly it is designated numerical-verbal 
intelligence. The factor has a significant nega- 
tive loading (—.26) on the lie index of the 
MMPI, implying that persons who do well in 
intelligence tests tend to refrain from falsifying 
answers on personality tests. 

Factor C has its highest loading on the hy- 
pomania scale of the MMPI (.56) and signifi- 
cant positive loadings on mechanical aptitude 
(.34) and mechanical knowledge (.38) as well. 
This is a logical pattern in that overactive indi- 
viduals often find outlet in mechanical pursuits. 
The factor is called tendency to over-activity. 

Over-active persons possess a considerable 
degree of emotionality (as evidenced by the 
negative loading of -.38 on the psychopathic 
deviate scale); this emotionality is shallow 
but varied. 

The factor has significant negative loadings 
on the “neurotic triad,” the hypochondriasis 
(-.23), depression (-.48) and hysteria (-.43) 
scales of the MMPI, indicating that 
individuals high on this factor tend to lack self- 
consciousness and self-criticism and have a 
direct acceptance of the environment. This 
suggestion of a “recklessness pattern” among 
men interested in submarine duty is somewhat 
similar to the finding of an Air Forces study of 
the traits of fighter pilots (34). 

It is interesting to note that factor C has 
nearly zero loadings on the two-hand coordina- 
tion test, although one would normally expect 
a correspondence between mechanical aptitude 
and two-hand coordination. The over-pro- 
ductivity in thought and action is evidently 
sufficient here to cause an attempt to think 
ahead, to “beat” the gadget by anticipating 
its movements, and, actually, to result in poor 
coordination performance. 

The negative loading of —.19 on tank per- 
formance grade shown for this factor is worthy 
of mention, even though the loading is just 
under the established criterion (.20) of 
significance. Tank performance rating penalizes 
a man who “rushes” the line in an attempt to 
complete an ascent too quickly. Here again, 
the element of impatience and impulsiveness 
appears. The finding is suggestive in view of 
a wartime service report (35) issued after a sub- 
marine crew had been subjected to long sub- 
mergence and heavy depth charging. In the 
colorful language of that report: ". . . when 
the long dive was over . . . the people who 
lasted out were those of a more phlegmatic 
disposition who didn't bother much when 
things were running smoothly. The worriers 
and hurriers had all crapped out, leaving the 
plodders to bring home the ship” (35). 

Factor D is labelled tendency to paranoia 
from the loading of .48 on the paranoia scale 
of the MMPI. The high loading on the lie 
index of the MMPI is logical in that individuals 
tending toward that trait approach personality 
tests suspiciously, and are prepared to admit 
nothing which might show them in an un- 
favorable light. The loading of .38 on the 
interest scale suggests that the individuals 
high on factor D were the more effeminate 
members of the group. There is a negative not 
quite significant loading on GCT, suggesting 
that those who falsify on the lie questions of 
the MMPI do poorly on GCT. Thus, factors 
B and D give corroborative support to one 
another. There may well be an index of 
stupidity present here also, with the less in- 
telligent men falling more easily into the trap 
presented by the lie questions. 

Factor E has its highest loadings on electrical 
knowledge (.68), mechanical knowledge (.57), 
mechanical aptitude (.43), and two-hand 
coordination (.36), and accordingly it is 
designated as mechanical coordination. The 
factor has a significant positive loading also 
on the validity scale of the MMPI (.31), 
indicating that persons high in mechanical 
coordination were meticulous in answering the 
questions of the personality test. The negative 
loading on the interest scale (-.15), while not 
quite significant, implies that the more 
masculine members of the group were more 
proficient mechanically. Factor E indicates 
also that the expected correspondence between 
mechanical ability and two-hand coordination 
is present when loadings on neurotic items are 
negligible, as is the case here. 

Factor F has positive significant loadings on 
the masculinity-femininity interest scale [...], 
the psychasthenia scale (.47) and the personal 
history section of the P.I. (.40). The 
significant negative loading on mechanical 
aptitude (-.41) is taken to indicate that a man 
leaning toward the feminine side of the interest 
scale is likely to get a lower score in 
mechanical tasks than will a person whom this 
scale measures as more positively masculine in 
interests. This supports the Terman-Miles view 
that there is a pronounced relationship between 
masculinity and mechanical pursuits at every 
educational level (36), and Strong’s definition 
of masculinity scores as an interest in things 
or objects rather than in persons or 
personalities (37). The most likely designation 
for factor F appears to be tendency to 
femininity of interest pattern. The high loading 
on psychasthenia shown for this factor suggests 
that the more effeminate man tends toward 
compulsive behavior; this is consistent with the 
MMPI test development where this is regarded as 
more a feminine than a masculine trait (13). 

One purpose of each area study of the type 
reported herein is to select single or composite 
[...] of the truly basic factors isolated, for 
relational analysis to significant components 
[...] in other areas (1). All six factors 
isolated in this area investigation will be 
represented in the final matrix. 


Summary 


1. Men who do well on GCT and arithmetic 
tests tend to refrain from falsifying on the lie 
questions interspersed throughout the MMPI. 

2. When neurotic elements are absent, there 
is the expected correspondence between 
mechanical ability and two-hand coordination. 
When such elements are present in the 
personality, they tend to make an individual 
rigid and confused with two-hand coordination. 

3. A man whom the MMPI estimates as 
tending toward femininity of interest pattern 
is likely to do less well on tests of mechanical 
ability than will one whom the Mf interest scale 
measures as more positively masculine. 

4. A trait suggestive of over-activity and 
recklessness was found in the personality 
pattern of some subjects. While alertness and 
[...] are pre-requisites for successful 
submarine action, there is some evidence that 
over-active individuals may find it difficult to 
tolerate prolonged submergence and confinement. 
Presumably the “hurrier” and the “plodder” 
both have place in the complete scheme of 
underseas operation, with its long periods of 
monotony interspersed with moments of intense 
activity. 

5. The evidence of a relationship between 
performance on intelligence and aptitude tests 
and personality traits as measured by the 
MMPI is considered worthy of note inasmuch 
as the minor personality accentuations found 
in this sample were within the generally 
acceptable ranges. 


Received September 26, 1949. 


References 


1. Cook, E. B., and Wherry, R. J. A study of the 
interrelationships of psychological and physio- 
logical measures on submarine enlisted candi- 
dates: I. History, experimental design and 
statistical treatment of data. Report No. 1 
BuM&S Research Project NM-003-017, U. S. 
Naval Medical Research Laboratory, U. S. 
Naval Submarine Base, New London, Conn., 
9 March 1949. 

2. Hathaway, S. R., and McKinley, J. C. A multi- 
phasic personality schedule: I. Construction of 
the schedule. J. Psychol., 1940, 10, 249-254. 

3. McKinley, J. C., and Hathaway, S. R. A multi- 
phasic personality schedule: II. A differential 
study of hypochondriasis. J. Psychol., 1940, 
10, 255-268. 

4. Hathaway, S. R., and McKinley, J. C. A multi- 
phasic personality schedule: III. The measure- 
ment of symptomatic depression. J. Psychol., 
1942, 14, 73-84. 

5. McKinley, J. C., and Hathaway, S. R. A multi- 
phasic personality schedule: IV. Psychasthenia. 
J. appl. Psychol., 1942, 26, 614-624. 

6. McKinley, J. C., and Hathaway, S. R. A multi- 
phasic personality schedule: V. Hysteria, hypo- 
mania and psychopathic deviate. J. appl. Psy- 
chol., 1944, 28, 153-174. 

7. Morris, W. W. A preliminary evaluation of the 
Minnesota Multiphasic Personality Inventory. 
J. clin. Psychol., 1947, 3, 370-374. 

8. Hunt, H. F., Carp, A., et al. A study of the differ- 
ential diagnosis efficiency of the Minnesota 
Multiphasic Personality Inventory. J. consult. 
Psychol., 1948, 12, 331-336. 

9. Schiele, B. C., Baker, A. B., and Hathaway, S. R. 
The Minnesota Multiphasic Personality Inven- 
tory. n.d. Departments of Neuropsychiatry 
and of Psychology, University of Minnesota 
Medical School. 

10. Meehl, P. E. Profile analysis of the Minnesota 
Multiphasic Personality Inventory in differential 
diagnosis. J. appl. Psychol., 1946, 30, 517-524. 




11. Clark, J. H. Application of the MMPI in differ- 
entiating A.W.O.L. recidivists from non-recidi- 
vists. J. Psychol., 1948, 26, 229-234. 

12. Abramson, H. A. The Minnesota personality test 
in relation to selection of specialized military 
personnel. Psychosom. Med., 1945, 7, 178-184. 

13. Hathaway, S. R., and McKinley, J.C. Manual of 
the Minnesota Multiphasic Personality Inven- 
tory, Revised Edition. New York: The Psycho- 
logical Corporation, 1943. 

14. McFarland, R. A., and Channell, R. C. A two- 
hand coordination apparatus for appraising apti- 
tude for flying. Division of Research, C.A.A., 
Washington, D. C., March 1942. 

15. Anon. The two-hand coordination test perform- 
ance of submarine men. Brown University, 
Providence, R. I. Report No. 5, project 44, 
Section D-4, NDRC, September 1942. 

16. Graham, C. H., Riggs, L. A., Bartlett, N. R., et al. 
A report of research on selection tests at the 
U. S. Submarine Base, New London, Conn. 
Brown University, Providence, R. I. OSRD 
report No. 1770, project 44, Division 7, June 
1943. 

17. McFarland, R. A., and Channell, R. C. A revised 
two-hand coordination test. Airman Develop- 
ment Division, C.A.A., Washington, D. C. Re- 
port No. 36, October 1944. 

18. McFarland, R. A., and Franzen, R. The Pensacola 
study of naval aviators. Division of Research, 
C.A.A., Washington, D. C. Report No. 38, 
November 1944. 

19. NRC Comm. on Selection and Training of Aircraft 
Pilots. Report on the Boston-Midwest project. 
Division of Research, C.A.A., Washington, D. C., 
Report No. 52, November 1945. 

20. U. S. Navy. Arithmetical reasoning test and 
mechanical aptitude test. Bureau of Naval Per- 
sonnel, Training Standards Section, Standards 
and Curriculum Division, Test and Research 
Unit. NavPers 16992, December 1944 (R). 

21. U.S. Navy. Electrical knowledge test, mechanical 
knowledge test, general classification test. Bu- 
reau of Naval Personnel, Training Standards 
Section, Standards and Curriculum Division, 
Test and Research Unit. NavPers. 16994, 
December 1944 (R). 

22. Willmon, T. L. Outline and discussion of methods 
for selection of submarine reserve personnel. 
U. S. Naval Medical Research Laboratory, U. S. 
Naval Submarine Base, New London, Conn. 
16 February 1948. 

23. U. S. Navy. Navy enlisted personal inventory, 
form 2, NavPers. 16845, IBM Form I.T.S. 
1100 A 1165 (R). 


24. Shipley, W. C., Gray, F., and Newbert, N. Stand- 
ardization and validation of the personal inven- 
tory: psychiatric criterion. OSRD report No. 
1606, Brown University, Providence, R. I. June 
1943. 

25. Shipley, W. C., and Graham, C. H. Final report 
in summary of research on the personal inventory 
and other tests. Applied Psychology Panel, 
NDRC, Report No. 10, project N-113, August 
1944. 

26. Kogan, L. S., Wantman, M. J., and Dunlap, J. W. 
Analysis of the personal history inventory. 
Division of Research, C.A.A., Washington, D. C. 
Report No. 42, February 1945. 

27. Wiener, D. N. Differences between the individual 
and group forms of the MMPI. J. consult. 
Psychol., 1947, 11, 104-106. 

28. Clark, J. H. Some MMPI correlates of color re- 
sponses in the group Rorschach. J. consult. 
Psychol., 1948, 12, 384-386. 

29. Bartlett, N. R. Review of research and develop- 
ment in examination for submarine training 
1942-1945. Report No. 2, BuM&S Research 
Project NM-003-036, U. S. Naval Medical Re- 
search Laboratory, U. S. Naval Submarine Base, 
New London, Conn. (In preparation.) 

30. Bartlett, N. R. Report on correlations of tests 
with grades in submarine school. Report No. 2, 
BuM&S Research Project X-243 (sub. 47), U. S. 
Naval Medical Research Laboratory, U. S. Naval 
Submarine Base, New London, Conn. 13 Feb- 
ruary 1945. 

31. Wherry, R. J., Brogden, H. E., and Gaylord, R. H. 
Wherry-Brogden, Gaylord method of factor 
analysis. Personnel Research Section, Adjutant 
General's Office, Department of the Army, 
Washington, D. C. (unpublished). 

32. Thurstone, L. L. Multiple factor analysis. 
Chicago: University of Chicago Press, 1947. 

33. Cottle, W. C. A factorial study of selected instru- 
ments for measuring personality and interest. 
Guidance Bureau, University of Kansas, n.d. 

34. U. S. A. A. F. Psychological research on opera- 
tional training in the continental air forces. 
AAF Aviation Psychology Program, Report No. 
16, Washington, D. C.: U. S. Government Print- 
ing Office, 1947. 

35. U. S. Navy. Depth charging of the USS. Fuse 
Section 71 T of report, Enemy anti-submarine 
measures. n.d. 

36. Terman, L. M., and Miles, C. C. Sex and per- 
sonality. New York: McGraw-Hill Co., 1936. 

37. Strong, E. K., Jr. Vocational interests of men and 
women. Palo Alto: Stanford University Press, 
1943. 


A Combined Oral Reading and Psychogalvanic Response Technique 
for Investigating Certain Reading Abilities 
of College Students 


Homer L. J. Carter 
Western Michigan College 


The purpose of this study is to describe and 
evaluate a combined oral reading and psycho- 
galvanic response technique for investigating 
certain reading abilities of college students. 
It is anticipated that the procedure may be of 
value not only in determining such factors as 
reading rate, comprehension, and errors, but as 
a means of discovering how much the reading 
situation affects the individual. In therapy 
this [...] for, as Maier (4) has shown, the 
frustration state, not behavior symptoms, must 
be [...]. In order to evaluate this procedure, 
materials and apparatus have been described, 
superior and inferior readers have been 
studied, resulting group data have been 
compared statistically, and inferences have 
been set forth tentatively. 


Materials and Apparatus 


Gray Oral Reading Paragraphs Test (3). 
The even numbered paragraphs were taken from 
the Gray Oral Reading Paragraphs Test. These 
paragraphs were typed on 3 x 5 cards and on the 
back of each card five questions were typed. It 
is assumed that the paragraphs given in the 
order 2, 4, 6, 8, 10, 12 constitute a scale of 
increasing difficulty but with steps of greater 
magnitude than that of the original scale from 
which the paragraphs were chosen. This 
modification of the Gray Oral Reading 
Paragraphs Test provides a record of reading 
rate, a comprehension score, and such reading 
errors as words aided, mispronounced, omitted, 
substituted, inserted and repeated. 

Apparatus. The apparatus used throughout 
the experiment was a “Maico Psychometer.” 
Change in palmar skin resistance (Delta R) can 
be measured in units ranging from zero to 100 
and stated in ohms by merely multiplying the 
indicated unit on the scale [...]. In this 
experiment, only change in palmar skin 
resistance in response to given stimuli is 
considered and it has been assumed that the 
extent of the deflection of the galvanometer is 
roughly proportionate to the intensity of the 
emotion or degree of frustration. 


Procedure 


Selection of Students. Twenty superior read- 
ers were selected from students scoring above 
the 75th percentile on Test III (reading) of 
the Ohio State Psychological Examination and 
twenty inferior readers were chosen from those 
scoring below the 25th percentile. Such 
factors as age, sex, and academic training were 
considered in making up the groups of superior 
and inferior readers. No attempt was made 
to control the factors of scholastic aptitude or 
general intelligence. First, second, third, and 
fourth year college students constituted both 
groups although the number of freshmen was 
equal to that of all upperclassmen. 

Application of Technique. As the apparatus 
was being applied, the examinee was given 
paragraph 2 and the following directions, 
“Read the paragraph on the card aloud. 
Avoid all reading errors. After you have 
finished, you will be asked questions concerning 
the material read.” The response to each 
paragraph was recorded in terms of number of 
errors, time required for reading, change in 
palmar skin resistance and comprehension 
score. A record of these data in the case of a 
freshman with a percentile of 8 on the reading 
section of the Ohio State Psychological 
Examination is shown in Table 1. 

Table 1 
Data Resulting from Application of Technique 
in Individual Case 

                                           Comprehension 
Paragraph   Errors   Time in Sec.   Delta R   (Weighted Score) 
4           0        17             10        5 
6           1        19             16        6 
8           0        21             24        6 
10          4        30             32        0 
12          5        28             32        0 


Results 


Data resulting from the administration of a 
combined oral reading and psychogalvanic 
response technique in the study of superior and 
inferior readers at the college level have been 
summarized as shown in Table 2. In determining 
scores in comprehension, weighted values of 1 to 
5 were assigned to questions on paragraphs 4 
through 12, respectively. Because paragraph 2 
was used in preparation for the examination, 
data resulting from its use were not included. 
The mean and sigma were determined for each 
distribution and the standard error of the 
difference of the means of small samples was 
found (2). In determining whether or not the 
differences between groups were significant, t 
was calculated for each difference. Differences 
between the means in the tabulation of the total 
number of errors, average time and comprehension 
are significant at the 1 per cent level. The 
average change in palmar skin resistance as 
shown by superior and inferior readers as they 
read paragraphs 4, 6, 8, 10, and 12 is not 
statistically significant (t = .66). However, on 
the more difficult paragraphs 8, 10, and 12 the 
difference in the means falls between the 5 and 
10 per cent levels of significance (t = 1.80). 
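
The small-sample test described above can be retraced with a short Python sketch using the Table 2 figures for paragraphs 8, 10, and 12 (good readers: mean 29, sigma 17.18; poor readers: mean 39.5, sigma 18.83; 20 readers in each group). It assumes that each sigma was computed with N in the denominator and that the variances were pooled over N1 + N2 - 2 degrees of freedom; under that assumption the reported standard error of 5.85 is reproduced. The function names are hypothetical.

```python
import math


def pooled_sd(sigma1: float, n1: int, sigma2: float, n2: int) -> float:
    """Pooled standard deviation for two small samples.

    Assumes each sigma was computed with N in the denominator, so the
    sum of squared deviations in a group is N * sigma**2.
    """
    sum_sq = n1 * sigma1 ** 2 + n2 * sigma2 ** 2
    return math.sqrt(sum_sq / (n1 + n2 - 2))


def se_of_difference(sigma1: float, n1: int, sigma2: float, n2: int) -> float:
    """Standard error of the difference between two independent means."""
    return pooled_sd(sigma1, n1, sigma2, n2) * math.sqrt(1 / n1 + 1 / n2)


def t_ratio(mean1: float, mean2: float, se_diff: float) -> float:
    """Difference between the means divided by its standard error."""
    return (mean2 - mean1) / se_diff


# Change in palmar skin resistance, paragraphs 8, 10, and 12 (Table 2).
good_mean, good_sigma, n_good = 29.0, 17.18, 20
poor_mean, poor_sigma, n_poor = 39.5, 18.83, 20

se_diff = se_of_difference(good_sigma, n_good, poor_sigma, n_poor)
t = t_ratio(good_mean, poor_mean, se_diff)

print(f"S.E. of difference = {se_diff:.2f}")  # 5.85
print(f"t = {t:.2f}")                         # 1.80
```

With 38 degrees of freedom the two-tailed critical values of t are approximately 1.69 at the 10 per cent level and 2.02 at the 5 per cent level, so a t of 1.80 falls between them, in agreement with the statement in the text.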

An analysis of data resulting from this 
study shows three trends which may be 
significant. Eight good readers and thirteen 
poor readers show an increase in frustration as 
number of errors increased and a loss in 
comprehension occurred. This suggests that more 
poor readers are frustrated by these reading 
disabilities than good readers. It is also 
apparent that 7 good readers and 6 poor readers 
show a decrease in frustration as errors 
increased and a loss in comprehension occurred. 
Consequently it may be that some individuals 
in both groups are not emotionally affected by 
their errors and inability to comprehend. 
Furthermore, 1 poor reader and 2 good readers 
who show increased frustration demonstrate 
decreases in both errors and in comprehension. 
This may indicate that in their cases emotional 
tension is due only to inability to understand 
what is read. 


Summary 


1. As generally expected such factors as 
number of errors, rate of reading and compre- 
hension scores differentiate significantly su- 
perior and inferior readers. 

2. Average change in palmar skin resistance 
cannot be expected to differentiate superior and 
inferior readers except as the material becomes 
comparatively more difficult. 

3. Nevertheless, this study is significant 
because it suggests the importance of apply- 
ing measures of frustration simultaneously 
with measurement of reading achievement. 
Consideration of frustration in human behavior 
is in keeping with the contributions of Maier 
in his studies of animal behavior. 

4. The technique described in this study 
provides objective data as to how much the 
accumulative effect of certain reading errors 
and inabilities affects the reader. This 
information may be of value not only in the 
diagnosis of reading disability but in its 
correction as well. As a result of this 
information, therapy can be directed toward the 
reduction of frustration, for example by 
providing easier material, and not merely toward 
the eradication of behavior symptoms such as 
rate, errors and comprehension. 


Table 2 
Data Resulting from Administration of Technique to Good and Poor Readers 
(.. = entry not legible) 

                                      Good Readers      Poor Readers 
                                      Mean     Sigma    Mean     Sigma    S.E.D    t 
Total Number of Errors                4        3.29     ..       ..       ..       .. 
Average Time in Seconds               18.75    2.30     ..       ..       ..       .. 
Comprehension Score                   63.75    14.08    ..       ..       ..       .. 
Average Change in Palmar Skin 
  Resistance for Paragraphs 8, 10, 12 29       17.18    39.5     18.83    5.85     1.80 


Received May 1, 1950. 
Early publication. 


References 


1. Carter, H. L. J. A combined projective and psycho- 
galvanic response technique for investigating 
certain affective processes. J. consult. Psychol., 
1947, 11, 270-275. 

2. Garrett, H. E. Statistics in psychology and educa- 
tion. New York: Longmans, Green, 1947. 

3. Gray, William S. Standardized oral reading para- 
graphs. Bloomington, Illinois: Public School 
Publishing Company. 

4. Maier, Norman R. F. Frustration. New York: 
McGraw-Hill Book Company, Inc., 1949. 


Geographical Sampling in Testing the Appeal of Radio Broadcasts 


John Gray Peatman 
City College of New York 


and 


Tore Hallonquist 


Columbia Broadcasting System 


For more than a decade hundreds of national 
and local radio programs broadcast over the 
outlets of the Columbia Broadcasting System 
have been studied with the Program Analyzer 
method for the purpose of determining their 
strong and weak points and improving them in 
the light of listeners' reactions and comments. 
Until 1947 all of these tests were conducted in 
the New York studios and consequently the 
question arose as to whether audiences in other 
parts of the country would give reactions 
similar to those obtained from New York 
listeners. To get at least a preliminary answer 
to this question, CBS employed the Program 
Analyzer technique for eight weeks in Holly- 
wood and two weeks in Boston during the latter 
part of 1947. 

Thus, this article describes the results of a 
series of Program Analyzer tests with New 
York audiences, Los Angeles audiences, and 
Boston audiences. The New York and Los 
Angeles audiences were presented two network
programs. One of these, a comedy-drama, 
originated in Hollywood and featured mainly 
West Coast talent. The other network pro- 
gram, a musical variety show, originated in 
New York and featured mainly East Coast 
talent. For the comparison of Boston and 
New York listeners an audience participation 
program local to Boston was used. One might 
expect considerable differences among these 
samples of the radio audience because of local 
familiarities and appeals. The Hollywood 
program, for example, might be expected to 
have greater appeal on the West Coast and 
the New York program to have greater appeal 
on the East Coast. Similarly, one might 
expect the local Boston program to have a 
much stronger appeal for Boston audiences 
than for New York audiences.


The Program Analyzer Technique 


Before describing the results of the above 
comparisons and considering the question 
whether the Program Analyzer technique has 
general or only limited usefulness in testing 
the appeal of radio programs, we shall briefly 
review the Program Analyzer technique,
originally developed in 1937 by Frank Stanton,
now President of the Columbia Broadcasting 
System, and Paul Lazarsfeld, now Chairman 
of the Department of Sociology at Columbia 
University. 

The Program Analyzer method brings a
sample of listeners into direct contact with a
radio or television program under scientifically
controlled conditions and records the listeners'
reactions to each successive second of the 
broadcast. 

There are two versions of the Program
Analyzer: “Little Annie,” which has been
continuously in operation at CBS since 1940, and
“Big Annie,” developed by CBS engineers and
used at CBS since 1944.

“Little Annie” consists of a moving tape with
a battery of 20 capillary pens. Ten of these
pens draw continuous red lines which serve as
guide lines on the tape. The other ten pens
draw lines in green. Each red and green pen
is electrically connected with a red and green
push-button in the Program Analyzer studio
so that when a button is pressed down the pen
jogs off the guide line on the tape and stays
off as long as the electrical circuit remains
closed. Two “Little Annies” are used
simultaneously and can record the reactions of
20 listeners.

Regularly broadcast spot announcements
invite listeners to participate in a Program
Analyzer test. People who respond to these
announcements are classified on the basis of
sex, age, education, occupation, and availability
at specific times. This information is
punched on IBM cards, one card for each
person. Each time a test is scheduled, the
cards are run through the sorting machine to
yield a sample of listeners that is controlled
with respect to such factors as sex, age, and
education, all of which have been found relevant
for stratification sampling of radio
listeners.
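The card-sorting step described above amounts to drawing a fixed quota of listeners from each stratum. The following is a minimal sketch of that idea, not the procedure CBS actually used: the record fields (`sex`, `age_group`), the quotas, the card file, and the function name `stratified_sample` are all hypothetical, and education is omitted to keep the example short.

```python
import random
from collections import defaultdict


def stratified_sample(cards, quotas, seed=0):
    """Draw a sample that fills a quota for each (sex, age_group) stratum.

    cards  : list of dicts, one per respondent (the analogue of one punched card)
    quotas : dict mapping (sex, age_group) -> number of listeners wanted
    """
    rng = random.Random(seed)

    # "Sorting" pass: group the cards by stratum.
    by_stratum = defaultdict(list)
    for card in cards:
        by_stratum[(card["sex"], card["age_group"])].append(card)

    # Draw the required number of cards, without replacement, from each stratum.
    sample = []
    for stratum, wanted in quotas.items():
        pool = by_stratum.get(stratum, [])
        if len(pool) < wanted:
            raise ValueError(f"not enough cards in stratum {stratum}")
        sample.extend(rng.sample(pool, wanted))
    return sample


# Hypothetical card file: 60 respondents spread evenly over six strata.
strata = [(s, a) for s in ("M", "F") for a in ("under 26", "26-40", "over 40")]
cards = [{"id": i, "sex": s, "age_group": a} for i, (s, a) in enumerate(strata * 10)]

# Hypothetical quotas for one test session of 12 listeners.
quotas = {
    ("M", "under 26"): 1, ("M", "26-40"): 2, ("M", "over 40"): 2,
    ("F", "under 26"): 2, ("F", "26-40"): 3, ("F", "over 40"): 2,
}

session = stratified_sample(cards, quotas)
print(len(session), "listeners invited")
```

Fixing the quotas in advance is what keeps the composition of successive test sessions comparable, whichever respondents happen to be drawn.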
Listeners are invited in groups of from 10 to
20 people. They are seated in the Program
Analyzer Studio, offered cigarettes and put at
ease. Each person is given a red push-button
for the left hand and a green button for the
right hand.
The group is usually given a pre-test
questionnaire containing questions about home
listening habits, program preferences, attitudes,
and so forth. A recording played over the
loud-speaker system next informs the listeners
about the test procedure. They are asked to
listen closely to the program, to press the green
button and keep it pressed down when they
feel a program part is good, that is, when they
want to listen to it; to press the red button when
they feel a program part is poor, that is, when
they don't want to listen to it; to press neither
button when they are indifferent to what they hear.
The lights are dimmed and a recorded version
of the program is then presented. One of the
chief assets of the technique is that listeners,
caught up in the momentum of the program,
give spontaneous, non-reflective responses.
The Program Analyzer, it may be said, provides
a psychological situation favorable for
eliciting a true response, even from listeners
who normally would not be able to articulate
their reactions and ideas. Because the procedure
itself (the pressing of the buttons) is
an extremely simple, spontaneous, and con[...]
procedure, the listener is almost irresistibly
led on to take a stand in regard to the
various aspects of the program. Once he has
committed himself, oral articulation during
subsequent interviewing is facilitated.
The interviewer in charge starts the recording
apparatus in an adjoining room. As the
program proceeds he notes down the positive
and negative reactions of each listener, as
indicated by the jogging pens. Thus, he
knows at the conclusion of the show the
spontaneous reaction pattern of each subject.
He goes back into the studio, the lights are
turned on, the main questionnaire is distributed,
and the attempt made to elicit the
listeners' considered opinions about what they
heard: their attitudes towards the program as a
whole, what they liked and did not like about
it, their reactions to specific elements, their
opinions of the cast, etc.

Finally, there is a period of oral interviewing. 
First calling on a listener whose reactions to the 
program were unfavorable, next calling on a 
listener whose reactions indicated satisfaction 
with the show, the interviewer encourages each 
person to talk informally about the program. 
Thus, two opposing points of view are estab- 
lished right at the start so that no listener need 
feel that his opinion is in the minority. Each 
participant is asked to tell how he felt about 
each aspect of the program and, in turn, is 
asked to give his conscious reasons for his 
reactions recorded on the tape. 

The questions and answers during the inter- 
view period are taken down in shorthand and 
transcribed. 

Thus, the Program Analyzer technique 
(“Little Annie”) yields three sets of data: (1) 
the second-by-second approval, disapproval, 
and indifference reactions of each listener as 
recorded on the Program Analyzer tape; (2) 
listener attitudes and opinions as expressed in 
writing in the test questionnaire; and (3) 
listener attitudes and opinions as expressed in 
the oral interview. 
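The first of these three data sets, the per-listener record of button presses, can be reduced to a group profile by counting, for each time unit, how many listeners held each button down. The sketch below assumes a hypothetical encoding (one character per listener per time unit) and a hypothetical function name, `unit_profile`. It shows only the tallying step; the formula CBS used to turn such tallies into a Program Analyzer score is not given in this article and is not reproduced here.

```python
def unit_profile(presses):
    """Tally approval, disapproval, and indifference for each time unit.

    presses : list with one string per listener; each character covers one
              time unit: 'G' = green button down (approval),
              'R' = red button down (disapproval), '-' = neither button.
    Returns a list of dicts, one per time unit, holding percentages.
    """
    n_listeners = len(presses)
    n_units = len(presses[0])
    profile = []
    for t in range(n_units):
        column = [record[t] for record in presses]
        like = 100.0 * column.count("G") / n_listeners
        dislike = 100.0 * column.count("R") / n_listeners
        profile.append({
            "like": like,
            "dislike": dislike,
            "indifferent": 100.0 - like - dislike,
        })
    return profile


# Hypothetical records for four listeners over four time units.
records = ["GG-R", "G--R", "GGGR", "-G--"]
profile = unit_profile(records)
for t, row in enumerate(profile):
    print(t, row)
```

Keeping the three percentages separate, rather than collapsing them at once into a single number, preserves the distinction the article draws between approval, disapproval, and indifference.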

The Program Analyzer technique and the 
type of information obtained when “Big 
Annie” is used are the same as for “Little 
Annie,” except that approval and disapproval 
reactions of the group as a whole are totalized 
instead of being differentiated for each listener. 
Furthermore, due to the size of the group 
present in the studio, it is not feasible to inter- 
view each subject at the end of the test session. 
Thus, the investigator relies mainly on the 
written questionnaire results for the verbalized 


opinions of the listeners. 


The Hollywood Program 


This national network comedy program 
originally developed by the CBS Network 
Program Department in Hollywood has main- 


Table 1
Composition of the Two Samples for the Test
of the Hollywood Program

                      New York     Los Angeles
                      Sample       Sample
                      N=76         N=95
Sex:
  Male                42%          41%
  Female              58           53
                      100%         100%
Age:
  Under 26            26%          16%
  26-40               41           34
  Over 40             33           50
                      100%         100%
Education:
  Grammar School      17%          31%
  High School         61           40
  College             22           29
                      100%         100%


tained a very high listener appeal for a number 
of years as indicated by audience ratings. 

A broadcast of the program was tested simul- 
taneously with a New York sample of 76 
listeners and a Los Angeles sample of 95 
listeners. The Los Angeles sample was tested 
with “Big Annie” and the New York sample 
with “Little Annie” in six successive sessions 
with approximately 12 subjects at each session. 
The composition of each sample, with respect 
to sex, age, and education is shown in Table 1. 
Some of the subjects were regular listeners to 
this program, others had never heard it. 

On the whole, the Los Angeles sample was 
more familiar with the program than the New
York sample as shown in Table 2. The over-all
reaction of the listeners to the program, as
measured by the Program Analyzer scores,
was identical for both samples, i.e., both tests
yielded an average attitude score of 32 which
is well above average for a comedy program.

Table 2
Prior Listening Habits to the Hollywood Program

                                        New York     Los Angeles
Listen to Program                       Sample       Sample
Regularly (almost every broadcast)      23%          [illegible]
Frequently (about every other week)     13           [illegible]
Occasionally (once in a while)          33           [illegible]
Never                                   31           [illegible]
                                        100%         100%

But just as striking is the comparison of the
minute by minute reactions of the two samples
to the sequences of events on the broadcast.
These reactions are portrayed in Figure 1.
The trend of listener reactions of each sample
is given by the heavy line graph on the two
charts of Figure 1. This trend is measured by
the Program Analyzer scores for successive
units of the program. It will be observed that
the trend lines for both samples are very
similar, despite the fact that 31 per cent of the
New York sample had not heard the program
before. The Los Angeles sample responded
more quickly in their approval of the program,
a result probably attributable to the greater
familiarity of the Los Angeles listeners with
the program and with its leading personalities.
The low points in each case are the commercials,
a typical result in Program Analyzer tests of
commercial programs. The over-all reaction
of the listeners as obtained from the questionnaire
administered at the end of each test
session is practically the same for both samples,
as indicated in Table 3. Conditional listeners
are those who would listen to future programs
if the programs were “improved a bit.” The
main appeals of the program to both samples
were very similar, as indicated in Table 4, which
shows the relative appeal of the principal
aspects of the broadcast. The story and plot
had relatively the least appeal for both samples.
It is to be emphasized that these figures
describe the relative appeal of the program
elements as obtained from the questionnaire
which asked: “This is what I enjoyed most
about the show”: The gags and jokes; The
story and plot; The personalities and characters;
Enjoyed none of these. Each subject
checked one answer to indicate which of the
foregoing elements was most enjoyable.

Fig. 1. Comparison of Hollywood and New York listeners: Hollywood comedy drama.
Top: profile of listener reactions, Hollywood listeners.
Bottom: profile of listener reactions, New York listeners.

Table 3
Per Cent Satisfied, Conditional, and Dissatisfied
Listeners to the Hollywood Program

                            New York     Los Angeles
                            Sample       Sample
Satisfied listeners         53%          [illegible]
Conditional listeners       27           21
Dissatisfied listeners      20           [illegible]
                            100%         100%

Table 4
The Relative Appeal of Program Elements of the
Hollywood Program

                                New York     Los Angeles
                                Sample       Sample
The gags and jokes              49%          49%
Personalities and characters    37           38
Story and plot                  9            9
Enjoyed none of these           5            4
                                100%         100%

The subjects were also asked to indicate 
whether or not they were anxious to hear the 
outcome of the plot. Their replies are de- 
scribed in Table 5. 

Finally, the great similarity of the reactions
of both samples to the broadcast is shown by
Figure 2, which describes the comedy appeal
of each of the characters on the program.
Each percentage value is the per cent checking
each character as “very funny.” It will be
observed that the ranks for each of the five
characters are the same for both samples even
though there are some differences in the
percentage value for each.

Fig. 2. Comedy appeal of program characters on
Hollywood program (per cent checking each character
as “Very Funny”). Bars for the New York sample and the
Los Angeles sample; the star and supporting characters.

Table 5
Interest in the Outcome of the Story of the
Hollywood Program

                              New York     Los Angeles
                              Sample       Sample
Anxious to hear outcome       72%          73%
Not anxious                   28           27
                              100%         100%

From the foregoing figures and tables we see
a remarkable similarity of both the over-all
and detailed reactions of two different,
geographically located samples hearing the same
broadcast. Such identity of response certainly
is not to be expected for all programs, particularly
for programs that may have sectional or
local flavor in their content. It is also to be
noted that both samples were city samples.
Such similarity of response might not be expected
in the comparison of rural or small-town
subjects with those of urban dwellers.


The New York Variety Show


In the second radio program to be considered,
comparisons again were made between
New York listeners and Los Angeles listeners.
This time, the program originated in New York
City, and its chief talent had obtained millions
of East Coast fans long before the West Coast
ever heard him. However, as in the case of
the Hollywood program, the New York show
has also developed a great deal of national
appeal over the CBS network. A particular
broadcast was tested simultaneously in New
York and Los Angeles. The two samples
consisted of 71 listeners in the New York test and
94 listeners in the Los Angeles test. The
composition of the two samples is given in Table 6.

Table 6
Composition of the Two Samples for the
New York Variety Show

                      New York       Los Angeles
                      Sample         Sample
                      N=71           N=94
Sex:
  Male                [illegible]    49%
  Female              [illegible]    51
                      100%           100%
Age:
  Under 26            28%            [illegible]
  26-40               37             [illegible]
  Over 40             35             43
                      100%           100%
Education:
  Grammar School      13%            39%
  High School         60             41
  College             18             20
                      100%           100%

Table 7
Prior Listening Habits to the New York Variety Show

                                        New York     Los Angeles
Listen to the Program:                  Sample       Sample
Regularly (almost every broadcast)      35%          24%
Frequently (about every other week)     20           13
Occasionally (once in a while)          28           40
Never                                   17           23
                                        100%         100%

Fig. 3. Comparison of reaction trends for New York and Hollywood listeners,
New York musical variety show.

As for previous familiarity with the program, 
we see in Table 7 that the situation is reversed 
from the Hollywood program samples: more of 
the Los Angeles sample had never heard the 
New York show before and a greater propor- 
tion of the New York sample listened regularly 
or at least "about every other week." 

As in the preceding tests of the Hollywood 
program, “Big Annie” was used for the Los
Angeles sample whereas “Little Annie” was
used in a series of seven successive sessions 
with the New York sample. The over-all 
results of the tests, as measured by the Program 
Analyzer score, were not too dissimilar. 

The reactions of the two samples to the 
successive program units of the broadcast are 
portrayed by the line graphs of Figure 3, each 
of which is based upon the average Program 
Analyzer scores for each program unit. The 
general trend of reactions for both samples 


is essentially the same, but the level of the trend
for the Los Angeles audience is somewhat lower
than that for the New York sample. The
high points and the low points in nearly all
cases are found to be practically identical. In
other words, the implications of the results for
these two samples are very similar in that any
recommendations for the improvement of the
program, based on the Program Analyzer tests,
would necessarily point to similar aspects of
the broadcast.

Table 8
Per Cent Satisfied, Conditional, and Dissatisfied
Listeners to the New York Variety Show

                            New York       Los Angeles
                            Sample         Sample
Satisfied listeners         [illegible]    [illegible]
Conditional listeners       25             36
Dissatisfied listeners      [illegible]    [illegible]
                            100%           100%

Table 10
Performances Liked Most and Liked Least

                                     New York     Los Angeles
                                     Sample       Sample
Performances liked best:
  No. 1  Third musical number        42%          52%
  No. 2  First musical number        23           18
  No. 3  Comedy skit                 21           17
  No. 4  Second musical number       14           [illegible]
                                     100%         100%
Performances liked least:
  No. 1  Comedy skit                 56%          58%
  No. 2  First musical number        16           [illegible]
  No. 3  Third musical number        14           [illegible]
  No. 4  Second musical number       14           [illegible]
                                     100%         100%

Differences between the results of the two
samples are brought out rather clearly in a
comparison of the relative appeals for each of
the program elements as shown in Table 9.
The M.C. (Master of Ceremonies) had the
most appeal for the New York sample whereas
the musical numbers had relatively the greatest
appeal for the Los Angeles sample. Unlike the
Hollywood program (Table 5), there is, of
course, no plot, but a comparison can be made
of the listeners' judgment of the musical
numbers and the comedy skit. Each subject
was asked which performance he liked best
and which performance he liked least.
The results are brought together in Table 10.

Table 9
The Relative Appeal of Program Elements:
New York Variety Show

                                  New York     Los Angeles
                                  Sample       Sample
M.C.'s personality and humor      43%          28%
The musical numbers               40           52
Comedy dialogue                   10           7
Comedy skit                       6            6
Other appeals                     1            7
                                  100%         100%

It will be seen that, despite some difference
in the trend of their over-all Program Analyzer
reactions to the program and despite the
difference in their reactions to the M.C., the two
samples were in close agreement with respect
to the individual performances. As is evident
from the data of Table 10, they ranked the
entertainers they liked best in the same order
and they ranked the entertainers they liked
least also in practically the same order.


The Boston Program 


This local daytime audience-participation
show was developed by CBS's owned and
operated Boston station WEEI and had been
heard for more than a year at the time the
Program Analyzer tests were made. In the
case of this comparison, therefore, the Boston
sample was exposed to a program with which
some were familiar and the New York sample
was presented a program which none had
heard previously. Both samples consisted of
71 listeners. “Little Annie” was used for both
tests in seven successive sessions with
approximately 10 subjects for each. The composition
of the samples with respect to sex, age, and
education was as shown in Table 11.

Table 11
Composition of the Two Samples
for the Boston Program

                      New York       Boston
                      Sample         Sample
                      N=71           N=71
Sex:
  Male                17%            3%
  Female              83             97
                      100%           100%
Age:
  Under 26            14%            12%
  26-40               [illegible]    27
  Over 40             45             61
                      100%           100%
Education:
  Grammar School      17%            21%
  High School         69             62
  College             14             17
                      100%           100%

Inasmuch as the New York listeners had not
heard the program before, a somewhat different
type of question was asked for comparative
purposes, namely, “How do you feel about
quiz programs broadcast in the daytime?”

The Boston program featured two quizzes,
a musical quiz and a radio star quiz, as well as
contests in the nature of stunts, a
doughnut-dunking contest and a cake-slicing contest.
Another feature of the program was a
“travel[...]” consisting of a search among the
members of the studio audience for the woman
with the most children. A total of 11 local
contestants appeared on the program and were
interviewed by the quizmaster prior to each
contest. The prizes consisted of merchandise.
A special program unit of the broadcast was
the “Radio Mirror” plug, a reference to the
fact that the program had received national
recognition in the August issue of this magazine.
The first half of this thirty-minute program
was sponsored and included three commercials
for a nationally advertised brand of
bread. The second half was sustaining.

The over-all reaction of listeners to the
program, as measured by their Program Analyzer
scores, was as follows: Boston listeners 33 (a
very good rating for this type of show), New
York listeners 25 (about average). The program
thus had a stronger over-all appeal for
Bostonians than for New Yorkers, which is not
unexpected in view of the local flavor of some
of the content and subject matter characteristic
of the show. This is true not only in
terms of the over-all reaction of the two
samples but also in terms of their reactions
to the individual program units described in
Figure 4.

The two trend lines are essentially parallel. 
The highspots and lowspots of the program are 
practically identical for both Boston and New 
York listeners. The musical quiz had the most 
appeal for both samples; the doughnut-dunking 
contest and the cake-slicing contest had the 
least appeal, aside from the commercials. The 
only difference of any consequence was in the 
relative appeal of the “plug” for Radio Mirror 
magazine which had some appeal for Boston 
listeners (local pride) but was of little interest 
to New York listeners. 
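The two claims made here, that the trend lines are parallel and that they differ only in level, can be checked separately: a correlation coefficient measures how closely the high and low spots coincide, and the mean difference measures the gap in level. The sketch below is illustrative only. The unit-by-unit scores are hypothetical values, chosen so that their means equal the over-all scores of 33 and 25 reported above; the article gives the unit scores only in graphic form.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical Program Analyzer scores for six successive program units.
boston = [30, 42, 35, 20, 38, 33]      # mean 33
new_york = [22, 34, 27, 12, 30, 25]    # mean 25, same shape at a lower level

r = pearson_r(boston, new_york)
level_gap = sum(b - n for b, n in zip(boston, new_york)) / len(boston)

print(f"correlation of trends: {r:.2f}")
print(f"mean difference in level: {level_gap:.1f}")
```

A correlation near 1 together with a sizable level gap is the pattern described in the text: both samples point to the same strong and weak units, while one sample rates the whole program higher.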

The over-all reaction of listeners to the 
program, obtained from the questionnaire ad- 
ministered at the end of the test sessions, 
further established a difference between the 
reactions of the two samples, as indicated in 
Table 13. 

A majority of the Boston sample was satis- 
fied with the program whereas a third of the 
New York listeners were dissatisfied. Despite 
this fact, however, the main appeals of the 
program were similar for both samples. This 
has been seen in Figure 4 and is further
confirmed in the questionnaire data summarized
in Table 14. 

Self-testing, which has been found to be a 
principal appeal in most quiz programs, was 
checked by nearly four-fifths of both groups as 


Table 12
Opinions about Day-Time Quiz Programs

                                        New York     Boston
                                        Sample       Sample
Such programs among my favorites        54%          43%
Like them as well as other daytime
  shows                                 31           30
Do not like them as well as other
  daytime shows or never listen to
  them                                  15           18
                                        100%         100%


Fig. 4. Comparison of reaction trends for Boston and New York listeners,
Boston audience participation program.


having been one of the items adding most to 
their enjoyment. Self-testing rated the highest 
of any element for satisfied, conditional and 
dissatisfied listeners in both New York and 
Boston samples. More than two-thirds of 
each sample derived a great deal of enjoyment 
from “hearing people have a good time”
(described as empathy in Table 14). On the
other hand, the two aspects of the program that 
had the least appeal were the prizes and the 
contests. 

The brunt of the success of any audience
participation show rests largely on the M.C.,
and it is interesting to note that the two
samples reacted similarly to the M.C. of this
program, although again the Boston audience
reacted somewhat more favorably, probably
because of its greater familiarity with his
personality prior to the time of the Program
Analyzer test. In Table 15, four aspects of
the M.C. are considered; the per cent of each
sample enjoying each is indicated. The M.C.'s
humor and jokes were least appreciated, but
even so, nearly three-fifths of the Boston
audience found them “very enjoyable.”

Table 13
Per Cent Satisfied, Conditional, and Dissatisfied
Listeners to the Boston Program

                    New York     Boston
                    Sample       Sample
Satisfied           46%          51%
Conditional         20           32
Dissatisfied        34           [illegible]
                    100%         100%

Table 14
Gratifications and Appeals of the Boston Program

                    New York       Boston
                    Sample         Sample
Self-testing        71%            [illegible]
Empathy             69             59
Human interest      42             58
[illegible]         46             56
Humor               [illegible]    [illegible]
[illegible]         [illegible]    52
Prizes              31             [illegible]
Contests            31             [illegible]


Table 15
Favorable Opinions about Personality and Performance
of the Master of Ceremonies

                                           New York     Boston
                                           Sample       Sample
M.C.'s personality                         67%          79%
M.C.'s handling of contestants             65           79
M.C.'s voice and manner of speaking        67           74
M.C.'s humor and jokes                     46           57


Summary 


The results of the New York-Boston tests
point to the advisability of testing “local”
programs with local audiences if accuracy for
the level of response and program appeal
is the principal question. On the other hand,
the program's producer could ascertain the
relative strong and weak points of the
program just as well from the New York
sample as from the local Boston sample,
inasmuch as the high- and low-spots tended to
parallel each other up to the very closing of the
program. We are of the opinion that this
would generally be the case except for programs
that are purely local in their content,
orientation, and atmosphere. Other Program
Analyzer tests have established, for example, that


quiz questions of the local sort, that cannot be 
answered other than by local audiences, have 
little or no appeal for outsiders. If self-testing 
were not the main appeal of quiz programs, 
then this might not make such a difference, 
but self-testing has repeatedly been found to be 
the principal appeal. 

The results of the New York-Los Angeles 
tests demonstrate that national network pro- 
grams can be satisfactorily analyzed, at least 
for urban audiences, with samples of subjects 
drawn from different geographic areas. We
are of the opinion that the Program Analyzer
technique has a general usefulness for the analysis
of such programs regardless of the urban area
from which the sample may be drawn. Unfamiliarity
of listeners with the principal character
(or characters) of a program will tend
to affect somewhat the general level of listeners’ 
likes and dislikes, but not decisively so, particu- 
larly since the major high- and low-spots of a 
program evidently will be the same. From 
the point of view of diagnosing the appeal of a 
program as a whole and its various parts, as 
well as making recommendations for the im- 
provement of future broadcasts, this is the 
primary consideration. 


Received April 1, 1950.
Early publication. 


The Effect of Color in Direct Mail Advertising 


J. William Dunlap 
Harvard School of Public Health 


There have been a number of discussions 
concerning the value of color in direct mail 
advertising. Birren¹ presents the results of
several studies which indicate that color pulls 
more returns than black and white in this type 
of advertising. The results of the several 
studies presented by Birren were, in some cases, 
slightly contradictory. Furthermore, perti- 
nent data needed for determining the statistical 
significance of the differences were not pre- 
sented. Among the colors Birren found to be 
best were: yellow, goldenrod, blue, and cherry- 
red. It was felt that a study should be con- 
ducted in which statistical tests could be 
applied to determine the significance of the 
results. 
The colors tested in the present study were:
yellow, blue, and cherry with white as the base
or control “color.” It was intended to use
the psychological primary colors of red, blue,
and yellow. However, since the true primary
colors could not be obtained from the paper
manufacturer, colors were chosen that were as
close to the psychological primaries as was
possible. The colors blue and yellow closely
match the primaries, but the cherry is somewhat
different.

It is necessary that the reader be given
some idea of the relative “brightness contrast”
between the print used and the color of the
cards. Paterson and Tinker stated that
legibility and speed of reading depend upon the
“brightness contrast” between the print used
and the background for the print.² The
background colors for the black print used in this
study are all comparatively light. However,
these colors approach maximum chroma or
saturation. There is little difference in the
“brightness contrast” between the white and
yellow cards and between the blue and cherry
cards. The contrast of the print on the blue
and cherry cards, however, is not as great as
the contrast between the print and the white
and yellow cards.

The nature of the “advertising” material was
a card notifying members of the Kansas State
Alumni Association that their annual membership
was expiring and this was their opportunity
to renew it.³ The cards were of [...]-ply
colored Hammermill Index Bristol. The
dimensions of the cards were 3 × 5 inches. The
message was printed in black on all cards,
regardless of color.

¹ Faber Birren, Selling with color. New York:
McGraw-Hill.
² D. G. Paterson and M. A. Tinker, How to make
type readable. New York: Harper, 1940.
Ch. 10, “Color of Print and Background.”
B ~ card 
During the first week of each month cards
were sent to all members whose membership
would expire that month. Every fourth
person on the mailing list received the same
color of card. No person received more than
one card.

Between the dates of Dec. 1, 1948 and August
6, 1949, a total of 572 cards were mailed out.
The distribution was as follows: 147 white,
144 yellow, 141 blue and 140 cherry. Table
1 shows the number of cards of each color
that were sent out each month and the number
returned. The highest percentage of returns,
50.7, was for the yellow cards. The order
of returns for the other colors was blue 46.1%,
white 40.8% and cherry 38.6%.

The question now is whether the observed
differences in returns are due to chance. Under
the null hypothesis it is assumed that no
difference exists due to the effect of color. A
simple and direct overall test for such a relation
is provided by chi square. The calculated
chi square [...] degrees of freedom [...].
The null hypothesis must therefore
be accepted, and it is concluded that no
difference in “pull” was found due to the use of
color.

s 
irren 
This result does not support Bi shout 


wit og 
cted "yum", 
? This study could not have been condu ord, [^ 


pr 
the cooperation and support of Kenny L'icbted don 
Secretary, KSC. The writer is also In" coure® 


í € 5 en 
Roy C. Langford for his suggestions an¢ 


Effect of Color in Di 


rect Mail Advertising 


Table 1 
The Number of Cards by Color Sent Out and Returned by Mailings 
Month White Yellow Blue Cherry 
Mailed Sent Ret'd Sent — Ret'd Sent Revd Sent Revd 
Dec, s 1 7 4 7 2 6 2 
Jan. 9 5 8 4 9 3 9 2 
Feb. 17 12 16 12 17 14 17 12 
Mar. 28 16 28 17 27 15 27 17 
Apr. 18 K 18 8 16 9 16 5 
May 25 6 25 1 24 4 24 2 
June 10 2 10 1 9 1 9 2 
July-Aug, 32 15 32 16 32 17 32 12 
Total 147 60 144 73 141 65 140 54 
op 
% Rev'd 40.8% 50.7% 46.1% 38.6% 


ee that color affects “pull” in direct mail 
a mr Birren reports a study done by 
or els who tested colored envelopes 
folloy 5 number of responses and obtained the 
ain results: blue 7.8%, yellow 6.8%, 
ue 6.4%, green 6.0%, pink 5.8% and 
Which 1%. He reported another study in 
cent Colors were measured in terms of per- 
alt of orders they produced. The results 
Fe bep goldenrod 21.42%, pink 17.83%, 
anda 7.82%, white 17.29%, kraft 15.8970 
i" ad envelope (color not stated) 9.75%. 
millin reports still another study done by a 

found ee pany who, testing “return cards, 
at 50.6% of the returns were cherry- 


red cards, while white and blue pulled 32.7% 
and 16.7%, respectively. 

The available evidence as to the effect of 
color in direct mail advertising is contradictory. 
It is possible that the results from the alumni 
membership cards represent a sampling error 
due to the content of the cards, the personnel 
sampled, or to the size of the sample. Ex- 
amination of all the data available gives the im- 
pression that black on yellow, buff, or golden- 
rod has a greater pull than does black on white. 
In view of the possible practical value of color 
in direct mail advertising the problem should 
be subjected to further investigation. 
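
The overall chi-square test on the Table 1 totals can be retraced with a short computation. The sketch below is an illustration and not the author's own working: the function name `chi_square_returns` is hypothetical, and the 5% critical value for 3 degrees of freedom (7.815) is an assumed significance level, since the level used in the study is not legible here.

```python
# Totals from Table 1: cards sent and cards returned, by color.
sent = {"white": 147, "yellow": 144, "blue": 141, "cherry": 140}
returned = {"white": 60, "yellow": 73, "blue": 65, "cherry": 54}


def chi_square_returns(sent, returned):
    """Pearson chi-square for a colors x (returned, not returned) table."""
    total_sent = sum(sent.values())
    total_returned = sum(returned.values())
    overall_rate = total_returned / total_sent  # pooled return rate under H0
    chi2 = 0.0
    for color, n in sent.items():
        observed = (returned[color], n - returned[color])
        expected = (n * overall_rate, n * (1 - overall_rate))
        chi2 += sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(sent) - 1  # (4 colors - 1) x (2 outcomes - 1)
    return chi2, df


chi2, df = chi_square_returns(sent, returned)
CRITICAL_05_DF3 = 7.815  # tabled 5% point for 3 df (assumed level)
print(f"chi-square = {chi2:.2f} with {df} df")
print("significant at the 5% level:", chi2 > CRITICAL_05_DF3)
```

With these totals the statistic falls below the assumed critical value, which agrees with the conclusion that the null hypothesis is retained.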


Received September 22, 1949. 


Brand Discrimination among Cigarette Smokers 


C. K. Ramond, L. H. Rachal, and M. R. Marks 


Tulane University 


It is a matter of common knowledge that the
essence of cigarette advertising is the claim
that the particular cigarette is distinguishable
from other brands. Habitual smokers frequently
comment that they are able to identify
their own brand. If these claims be true, it 
follows that there must be discriminable dif- 
ferences among brands, and the problem of 
ascertaining the extent of such differences is 
one of interest to psychologists. 

The writers have found only two studies 
which are immediately relevant. Husband 
and Godfrey (2) in 1934 worked with 5 differ- 
ent brands. They requested 51 Ss to attempt 
the identification of 4 cigarettes, under the 
condition that S was told only that his cigarette 
was included among the 4. The report does 
not state clearly whether the S knew just what 
brands were possible choices. Ss were blind- 
folded. The data are given in terms of per- 
centage of correct and incorrect identifications 
for each brand tested. Although no statistical 
techniques were employed to evaluate the data, 
Husband and Godfrey concluded that most 
cigarettes tested were identified correctly
slightly more times than would be expected by
chance. They noticed anomalous findings,
e.g., “Camels are identified as Chesterfields
more often than as themselves” (2, p. 222).

There is evidence, however, that the use of 
a blindfold obscures the central problem. Hull 
(1) in 1924, while studying the physiological
effects of tobacco smoking, found that his
blindfolded Ss frequently could not distinguish
between real tobacco smoke, and warm-moist
air, when both were inhaled through a pipe
mouthpiece. Those readers who are habitual
smokers may recall that when they are smoking
in the dark they are sometimes not sure of
whether they are smoking at all!

The present investigation was designed to
test capacity for discrimination among various
popular brands of cigarettes, to wit, Camels,
Chesterfields and Lucky Strikes, when S was
allowed to see both the cigarette and the smoke.
Answers for the following questions were
sought:


1. Do correct identifications exceed chance 
expectancy, and, if so, what is the margin of 
improvement over chance? 

2. Do Ss, who are permitted to smoke the
brands interchangeably, make higher identification
scores than Ss who are required to
smoke a single brand until they commit themselves
as to its identity?

3. Does an S who habitually smokes a given 
brand identify that brand correctly more often 
than do Ss who habitually smoke some other 
brand? 


Procedure 


Subjects: Only smokers who consumed at
least one pack of cigarettes in a four day period
participated in the study. They were 200 in
number, of both sexes, accidentally sampled
from Tulane University students. Ss were
approached outside of classrooms, in student
centers, and on streets near the campus.

Apparatus: This consisted of mimeographed
identification questionnaires, gummed labels,
candy mints, and most important, 1,200
cigarettes, 400 each of the brands named
above. Identical plain gummed labels
were pasted in identical positions over the
brand names of the cigarettes used in the test
smoking. Each label bore an initial: X, Y,
or Z. These non-committal identifying marks
were coded to the brand names, and for
security purposes, the code was changed part
way through the study. The questionnaire
was as follows:


Circle the brand name which you think the
labeled cigarette is:

      X               Y               Z
Camel           Camel           Camel
Chesterfield    Chesterfield    Chesterfield
Lucky Strike    Lucky Strike    Lucky Strike


Method: Two principal groups of Ss were
employed. An attempt was made to obtain in
each group 25 Ss who habitually smoked each of
the brands tested, and also, Ss who habitually
smoked other brands. Actually, in Group I,
there were 24 Ss who smoked Camels, 27 who
smoked Chesterfields, 24 who smoked Lucky
Strikes, and 25 smokers of miscellaneous
brands. In Group II, the corresponding breakdown
was 25 smokers for each of the categories
described.


Practice Smoking: Instructions given to each
S in Group I were as follows: “Light all three
of these cigarettes (one each of the three brands)
and smoke them interchangeably. Notice the
brand which you are smoking and look for
signs which will help you identify it
later. Smoke until you feel you will be able
to make the identifications.” Instructions
given to each S in Group II were as follows:
“Light only one of these cigarettes. Notice
its name and smoke it until you feel that you
can identify it later. Then take another cigarette
and smoke it in the same way. Put out
one cigarette before lighting another.”

Test Smoking: Each S was given a candy
mint to clear the taste of the practice smoking.
The method paralleled that used in the practice
smoking. Ss in Group I smoked the three test
cigarettes interchangeably; Ss in Group II
smoked them one at a time. Ss recorded a choice
for each of the three cigarettes. In no case
included in the data did an S use a given brand
name to identify more than one of the test
cigarettes. None of the Ss was told of his
degree of success.


Results


Table 1
Number of Identifications of Cigarette Brands by
Brand Name and by Experimental Group

                          Identified As
                 Camel    Chesterfield   Lucky Strike
Brand            I   II      I   II         I   II
Camel           48   40     24   33        26   21
Chesterfield    23   35     40   41        ..   23
Lucky Strike    30   24     ..   27        41   50


Table 1 shows the number of times each
cigarette was identified as itself and as some
other cigarette, by experimental groups.
The first pertinent question is, “Do correct
identifications exceed chance expectancy; and
if so, what is the margin of improvement over
chance?” The χ² test of independence of
principle of classification was applied to the
data of Table 1. With 4 degrees of freedom,
χ² = 32.75. A value as great as this could be
expected by chance only 1 time in 100,000.
The chance expectancy is of course 33.3%.
The overall average percentage of correct 
identifications was 44.5. Thus, the margin 
of increase, while highly significant, is small 
in magnitude. The figures for individual 
cigarettes and individual groups do not differ 
significantly from the average figure of 44.5% 
correct identifications. The actual percentages
were: For Group I, Camels, 48; Chesterfields,
40; Lucky Strikes, 41. For Group II, 40,
41, and 50, respectively. It appears that,
while no individual cigarette is more “distinctive”
than any other, there is a slight but
significant discriminability among the three 
brands tested. 
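
The statement that a χ² of 32.75 with 4 degrees of freedom would arise by chance only about 1 time in 100,000 can be checked directly. The sketch below uses the closed-form upper-tail probability of the chi-square distribution for an even number of degrees of freedom; the function name `chi2_sf_even_df` is hypothetical, and the check is offered as an illustration, not as the procedure the writers followed (they presumably read the value from a table).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for chi-square with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)**i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


p = chi2_sf_even_df(32.75, 4)
print(f"P(chi-square >= 32.75 | 4 df) = {p:.2e}")
print("rarer than 1 in 100,000:", p < 1e-5)
```

Under this formula the probability is somewhat smaller than 1 in 100,000, so the figure in the text reads as a conservative bound.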

The second question is, “Do Ss who are 
permitted to smoke the brands interchangeably 
make higher identification scores than Ss who 
are required to smoke a single brand until they 
commit themselves as to its identity?" This 
question may be answered by comparison of 
Groups I and II. Group I averaged 44%
correct; Group II averaged 45% correct. The 
difference is not statistically significant. It 
appears that this limited practice is not effec- 
tive in differentiating between the groups. It 
is quite possible that the two kinds of practice 
did not change the scores at all, so that if a 
third group had been employed in which only 
the test smoking was administered the identifi- 
cation scores would have been as high as those 
obtained in this study. It should be noted 
that the Group II conditions approximate the 
“real life” situation. Habitual smokers smoke 
only one cigarette at a time! 

The third question is, “Does an S who 
habitually smokes a given brand identify that 
brand correctly more often than do Ss who 
habitually smoke some other brand?" Of the 
Camel smokers, 75% were able to identify 
their own brand. Corresponding figures for 
Chesterfields and Lucky Strikes were 70% 
and 74%. There is no significant difference 
among these percentages. Smokers of par- 
ticular brands tested (i.e., excluding the mis- 
cellaneous smokers) identified the popular 
brands other than their own less frequently 


284 


than did the habitual smokers of those brands. 
Of the combined Chesterfield and Lucky Strike 
smokers, only 39% identified Camels correctly;
of the combined Camel and Lucky Strike 
smokers, only 42% identified Chesterfields 
correctly; of the combined Camel and Chester- 
field smokers, only 44% identified Lucky 
Strikes correctly. There are no significant dif- 
ferences among these percentages, but all of 
them are significantly less than the percentages 
of correct identifications of “own” brands 
given above. Miscellaneous smokers did even 
less well. Camels were identified correctly by 
only 24%, Chesterfields by 14%, and Lucky 
Strikes by 22%. Again, there are no signifi- 
cant differences among these percentages, but 
all of them are significantly less than the per- 
centages for habitual smokers of the brands 
tested. The answer to the question, “Do
smokers know their own brands?”, is fairly
clear. Although the overall percentage of
correct identifications was 44.5, the “own
brand” identifications averaged 73%; identifications
of popular brands by smokers of other
popular brands averaged 42%; identifications
by smokers of miscellaneous brands averaged
20%. There is thus a well-defined tendency
for smokers of particular brands to know that
brand. Smokers of miscellaneous brands score
lower than could be expected by chance. They
seem to have positive misinformation about the
differential tastes of the popular brands.
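
The three averages quoted above follow from the brand-by-brand percentages given earlier in the section. The sketch below recomputes them as simple unweighted means, which is an assumption: the text does not say whether the averages were weighted by the number of Ss in each category.

```python
# Percent correct by type of smoker, as reported in the text
# (order: Camels, Chesterfields, Lucky Strikes).
percent_correct = {
    "own brand": [75, 70, 74],
    "other popular brand": [39, 42, 44],
    "miscellaneous brands": [24, 14, 22],
}
CHANCE = 100 / 3  # three alternatives, so 33.3% by guessing

# Unweighted mean per category (assumed; weighting is not stated).
means = {k: sum(v) / len(v) for k, v in percent_correct.items()}

for smoker_type, mean in means.items():
    print(f"{smoker_type:22s} mean = {mean:4.1f}%  "
          f"difference from chance = {mean - CHANCE:+5.1f}")
```

Only the miscellaneous smokers fall below the chance level of 33.3%, which is the basis for the remark about misinformation.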


Summary


One hundred Ss, each of whom customarily
smoked at least one pack of cigarettes each
four days, smoked one each of three
brands, Camels, Chesterfields, and
Lucky Strikes, interchangeably, both
in practice and in test. Another 100 Ss, both
in practice and in test, smoked the same brands
one at a time. The sampling was from college
students, and generalization of the
results is applicable only to such a population
except insofar as the sample tested reasonably
represents the general smoking population.
Further, in a short-run test such as this, such
factors as throat-irritation or, contrariwise,
acquired insensitization, are minimized. In a
more extended smoking test the obtained
frequencies of correct identifications might well
change. With these restrictions in mind, the
following conclusions were reached:


1. All three brands tested were identified correctly
an average of 44% of the time as against
a chance expectancy of 33.3%. The increase,
though slight, is significant statistically.

2. There were no significant differences in
frequencies of correct identifications among
the three brands tested.

3. There was no significant difference in
correct identifications between those Ss who
smoked their cigarettes simultaneously and
interchangeably, and those who were limited to
one cigarette at a time. The latter condition is
that which corresponds more closely to actual
smoking practice. It is suggested that the
training used in this study was not sufficiently
extensive or intensive to be evocative of
differences in performance.

4. Habitual smokers were able to identify
their own brand significantly more often than
smokers of other brands; this facility was
uniform in habitual smokers of all three brands
tested.

5. It would appear that claims of cigarette
advertisers and habitual smokers to the effect
that there are discriminable differences among
various brands are technically true, but
actually of small magnitude.

6. No data of this study indicate greater
discriminability of any particular brand, or
greater discriminatory capacity by smokers
of any particular brand.


Received October 20, 1949. 


References

1. Hull, C. L. The influence of tobacco smoking on
   mental and motor efficiency. Psychol.
   Monogr., 1924, 33, 161.
2. Husband, R. W., and Godfrey, J. An experimental
   study of cigarette identification. J. appl.
   Psychol., 1934, 18, 220-223.


Report on the Journal of Applied Psychology for 1949 


A summary of the materials published in
Vol. 33 of the Journal of Applied Psychology is
presented in the following table together with
the corresponding data for Volumes 27-32.


Year Vol. Articles Book Reviews
1949 33 76 33 
1948 32 78 34 
1947 31 80 29 
1946 30 67 30 
1945 29 53 18 
1944 28 54 18 
1943 27 58 9 


The number of pages printed from 1946
through 1949 is as follows:


                     1946   1947   1948   1949
No. pp. (“reg.”)      476    478    485    479
No. pp. (“early”)     192    186    199    138
Total                 668    664    684    617


Vol. 33 was one page shy of the budgeted
limit of 480 pages. Early publication contributed
a total of 18 extra articles and 138 extra
pages for our APA reader-owners and outside
subscribers.

Early publication in 1949 = 18 articles; in
1948 = 21 articles; in 1947 = 28 articles; in 1946
= ... articles. The lag in publication of “early
publication” articles ranges from 2 to 4 months
with a median of 3 months.

In addition to printing “date of receipt” at
the end of each article, “early publication” is
added if such be the case. Thus all information
concerning editorial action on published
articles is available to contributor-owners,
to reader-owners, and to outside
subscribers.

The disposition of manuscripts received during
1949 and in each of the three preceding
post-war years is as follows:


                     1946   1947   1948   1949
Accepted               67     59     89    102
Rejected               38     ..     35     74
Total Received        105     ..    124    176
Percent Rejected       36     43     28     42


The number of Mss. received shows a steady 
post-war increase throwing a constantly heavier 
burden on the editor and his Consulting 
Editors. 

A larger number of manuscripts would have 
been rejected had it not been for our policy of 
giving authors a chance to revise manuscripts 
in accordance with detailed suggestions pro- 
vided chiefly by our Consulting Editors. Of 
102 Mss. accepted in 1949, 61 were accepted 
“as is” and 41 were accepted following revision 
by the author. 

The lag in publication of articles published 
in regular turn varies from time to time because 
manuscripts are not received in an even flow. 
During 1949 the low point was 5 in December 
and the high points were 23 in April and 23 in 
October. The median lag for regularly sched- 
uled articles in Feb., Apr., and June 1947, for 
the same months in 1948, for the same months 
in 1949, for Aug., Oct., Dec., 1949, and for 
the same months in 1950 is as follows: 


Median Months of Lag

Feb., Apr.,   Feb., Apr.,   Feb., Apr.,   Aug., Oct.,   Feb., Apr.,
June 1947     June 1948     June 1949     Dec. 1949     June 1950
    12             9             7             9            10


Lag in publication is thus shown to be
increasing since the first half of 1949. As of
April 15, 1950 there are 39 accepted Mss. on 
hand plus 14 Mss. accepted if revised, and 7 
Mss. “action pending,” or a total of 60 Mss. 
likely to be published in August 1950 or later. 
The estimated lag for the last few Mss. accepted 
during the first half of April 1950 is 12 months. 

The problem of lag in publication is thus 
once again becoming acute in spite of the continued
policy of: 1. “Brevity consistent with
clarity.” 2. Early publication at author’s ex- 
pense. 3. Use of American Documentation 
Institute for large, unwieldy, and costly tables 
and figures. 

The APA Council of Representatives Direc- 
tive adopted at the Denver meeting in Septem- 
ber 1949 will aid the editor in persuading 
authors to abbreviate their reports and to 
make increased use of ADI. During 1949, 
seven articles used ADI. During the first half
of 1950, six articles used ADI. Vigorous
action by the editor is the only way lag in pub- 
lication can again be reduced to a desirable six 
or seven month lag. If the postwar pressure 
continues to increase and we have exhausted 
all possibilities of "brevity consistent with 
clarity" and use of ADI then the Board of 
Editors and the Committee on Publications 
will be forced to consider an increase in the 
budgeted number of pages per volume. 

The problem of the long delay between sub- 
mission of edited copy to the printer and the 
receipt of a given issue by subscribers con- 
tinues to be acute. An increased number of 
authors write in concerning their reprint 
orders because of the long time elapsing be- 
tween appearance of an article in a given issue 
and receipt of reprints.


The new cover page and the new double
column format have been received with favor
judging from unsolicited comments. The 480
pages per volume of the old format with about
500 words per page has been reduced to 384
pages per volume in the new format but with
about 750 words per page. Thus, there will
be a slight increase in the number of articles
that can be published per volume beginning
with 1950.
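
The effect of the format change on capacity can be worked out from the page and word counts just given. The sketch below is only an arithmetic check; the words-per-page figures are the approximate ones quoted above, and the variable names are illustrative.

```python
old_pages, old_words_per_page = 480, 500   # old single-column format
new_pages, new_words_per_page = 384, 750   # new double-column format

old_capacity = old_pages * old_words_per_page
new_capacity = new_pages * new_words_per_page
gain = (new_capacity - old_capacity) / old_capacity

print(f"old format: about {old_capacity:,} words per volume")
print(f"new format: about {new_capacity:,} words per volume")
print(f"relative change: {gain:+.0%}")
```

On these approximate figures the new format holds roughly a fifth more text per volume despite the smaller page count.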

The editor again wishes to express his
appreciation to his Consulting Editors for
invaluable aid in evaluating manuscripts and in
submitting detailed suggestions for revision
when such is indicated.


Donald G. Paterson 


Editor 
Received April 22, 1950.


Book Reviews 


Drake, Frances S., and Drake, Charles A. A
human relations casebook for executives and
supervisors. New York: McGraw-Hill Book
Company, Inc., 1947. Pp. xiv+187. $2.50.


In reviewing a recent textbook on personnel
management I was impressed by the thoroughness
of its presentation of facts and principles
but felt that it was very dry reading.
The skeleton was there, all neatly organized
and patterned, but it lacked the living tissue
necessary to make it come alive.
This “human relations casebook” tries to
add flesh and blood to the formal structure of
personnel management and does it pretty well.
It presents 75 case histories, drawn from actual
experience and descriptive of a wide variety of
situations in industry. Section I, Adjusting
the Human Resources, includes 17 cases involving
selection, transfer, emotional deviations,
training, retirement, etc. Section II,
Developing Attitudes and Sentiments, presents
11 cases illustrative of factors affecting employee
morale. Section III, Using and Abusing
Incentives, portrays 11 incidents dealing with
financial and non-financial rewards.
Section IV, Bargaining with Individuals and
Groups, presents 8 cases involving handling of
grievances and labor demands. Section V,
Mobilizing the Brain Power of an Organization,
deals with the use of conferences, suggestion
systems, and work simplification programs
through means of 6 cases. The final Section VI,
the Ways of Executives and Supervisors,
portrays 22 specific incidents illustrative
of good and poor personal traits and behavior
of managers.
All of the case histories are quite brief
and intentionally “stripped of much background
and emotionally toned descriptive
matter” to focus attention on the principles
involved. To aid the reader in profiting
vicariously from the successes and errors of
others, each case is presented in the following
manner: first, the area illustrated by the case
is briefly outlined; second, the case history is
presented; third, interpretive comments are
made, generalizing from the specific incident;
fourth, questions for group discussion are


raised; fifth, the reader is asked to formulate 
in his own words the primary lesson or principle 
drawn from the case under discussion. Facili- 
tating the use of the book as a text for training 
group or classroom discussion are well selected 
and annotated bibliographies at the end of 
each of the six major sections. 

Learning (or teaching) through case ex- 
amples has obvious limitations. Solutions 
successful in one situation frequently fail in 
another and it is a common human error to 
generalize too quickly from a few instances. 
The authors seem to be aware of this in their 
deliberate cutting out of many of the details 
of the cases and in their emphasis upon the in- 
terpretation of the situation in terms of general 
principles. The principles do not grow out of 
the cases so much as the cases are illustrative of 
the principles. Looked at in this light, this 
reviewer felt it inappropriate to make what 
seemed on first reading to be obvious criticisms 
on the briefness of the cases and the failure to 
include certain expected types of situations. 

The real merit of this book lies in the 
soundness of the particular principles illus- 
trated, the value of the case examples in stimu- 
lating thought by the reader, and its general 
usefulness as a supplementary textbook or 
as a workbook in a supervisory development 
program. 

Albert S. Thompson 


Teachers College, 
Columbia University 


Mossin, A. C. Selling performance and contentment
in relation to school background.
New York: Bureau of Publications, Teachers
College, Columbia University, 1949. Pp.
viii+166. $2.75.

Do salesgirls who have completed high 
school courses in “distributive” subjects per- 
form more efficiently, and are they more con- 
tented with their work than salesgirls who 
have not had such courses? These are the two 
major questions that this study posed for in- 
vestigation. 

The subjects were 94 salesgirls from a large
department store in New York City, rather
homogeneous with regard to such factors as
period of employment, marital status, and
duration of high school attendance prior to
employment. They differed in having followed 
in high school what the author classified as 
either a college preparatory, commercial, dis- 
tributive occupational, or clothing arts cur- 
riculum. 

Job performance was measured by means of 
subjective criteria derived from independent 
ratings by four trained shoppers who observed 
each girl at work. Job contentment was 
evaluated by responses to three instruments 
specifically constructed for this investigation, 
namely a Job Functions Interest Blank, a Job 
Conditions Satisfaction Questionnaire and a 
Job Ranking Test. 

No significant differences were found be- 
tween the performances of salesgirls with dif- 
ferent high school curricular backgrounds. 
Some slight tendencies were evidenced for 
girls who had taken the “distributive” cur- 
riculum to score higher on measures of job con- 
tentment than did girls from the other cur- 
ricular groups. The “distributive” group ex- 
pressed a greater desire to remain in sales work 
than did the three other curricular groups. 

The study thus lends some support to other 
investigations which have indicated that in- 
terests, although not always useful as pre- 
dictors of job performance, are indicative of 
motivations important to job contentment, 
worker morale, and reduction of employee 
turnover. However, certain weaknesses in
experimental design forced upon the investigator
by the particular department store situation
tend to vitiate positive conclusions from the
data.

The salesgirls were drawn from 41 different
departments of the store, thus making comparisons
between their selling tasks rather
tenuous in view of differences in materials sold
and problems encountered in selling. Only 13
of the 94 subjects were classified in the “distributive”
curricular group and six of these had
taken only two “distributive” courses. The
author had to rely upon subjective criteria for
the evaluation of job performance. The reliabilities
of these (ratings by shoppers) were
quite low and correlated .30 or lower with
supervisors' ratings of the same employees.

Perhaps an inherent limitation in this approach
to comparing workers' high school backgrounds
is that these do not in themselves
reveal attitudes toward curricula and work
that are indicative of the motivations underlying
vocational choice. We must know more
about the meaning of vocational choice to individual
students and workers if we are to evaluate
properly the contributions of their backgrounds
to job performance and contentment.


Daniel Raylesberg 


B'nai B'rith Youth Organization 


Vernon, Phillip E., and Parry, John B. Personnel
selection in the British Forces. London:
University of London Press Ltd., 1949.
324 pp. 20/ net.


This book presents a summarized account of
the application of psychological methods to
personnel selection in the British Navy, Army,
Air Force, and Army Territorial Service during
World War II. The book is addressed to both
industrialists and educators with the hope that
the methods employed during the war in the
British Armed Forces may be found useful in
the peace-time selection of employees and in
student selection and guidance. Throughout
the book the authors have attempted to treat
their material as an integral part of the whole
field of personnel psychology. War experience
in the application of psychological
methods to selection problems is, therefore,
integrated with the knowledge which existed in
the field prior to the war. Only material pertaining
to personnel selection is included;
applications of psychology to training, design
of equipment, morale and so forth are not
treated.

Part I is concerned with the organization of
selection programs, the general procedures employed
and the work of psychologists in the
Royal Navy, Army, Royal Air Force and
Army Territorial Service. The work of non-military
psychologists is not described. A
chapter on the rise of vocational psychology is
also included. This part is designed to give
the reader a background for understanding
the organizations within which psychologists
worked. Unless this part is read it will
not be easy to read the later chapters, particularly
since many abbreviations are employed.
Appendix I, Abbreviations, can be used for
reference purposes.
In Part II the authors present the highlights
regarding the applications of psychology to
personnel selection in the British Armed
Forces. The first two chapters of this section
are excellent condensed statements of the basic
principles of vocational classification and the
selection procedures as a whole in the
Armed Forces. Other chapters are
devoted to the biographical questionnaire,
the interview, principles of psychological testing,
and various types of tests: intelligence,
educational, non-verbal, mechanical,
special aptitude, and temperament tests. A
separate chapter is devoted to selection findings
in the Royal Air Force. A chapter on conclusions
completes Part II.

In the opinion of this reviewer the authors
have made three significant contributions in
the writing of this book. First, they have presented
a brief, clear description of selection
procedures as they existed in the British Armed
Forces during World War II. The student of
military and personnel psychology will welcome
having this material in a readily available
form. Second, the book provides a good summary
of the applications of psychology to
personnel selection before the war. The
chapters on the biographical questionnaire, the
interview and principles of psychological
testing are particularly good in this respect.
Third, the results obtained during the war are
critically evaluated and integrated with the
knowledge which existed prior to the war. In
a sense the book almost impresses one as a
textbook in personnel psychology rather than
being a description of war experiences with the
applications of psychology to personnel selection.
The book is by no means limited to a
description of findings. The contributions of
war experience to the total field of personnel
psychology and the peace-time applications of
these experiences are continually kept in mind
by the authors.


As one reads this book he is impressed with
the similarity of findings in the British and
American Armed Forces. The types of tests
which “worked” were much the same, the interview
was beset with identical difficulties and
the criterion was a problem for the British as
well as American psychologists. Thorough
follow-up studies were equally few in number
in the armed forces of the two countries.

The serious student of personnel psychology
may be disappointed to find that data are 
presented only in summarized form. It is 
stated by the authors that they hope more 
detailed reports can be published elsewhere. 
The book is well written for its intended audi- 
ence and will be widely read by personnel 
psychologists, personnel administrators and 
guidance workers. It will be useful as a refer- 
ence work or supplementary text in courses 
concerned with personnel selection. 


Dewey B. Stuit 
State University of Iowa 


Warner, L. W., Gardner, B. B., Henry, W. E., 
and Haggard, E. A. Identifying and devel- 
oping potential leaders. New York: Ameri- 
can Management Association Personnel 
Series No. 127. 1949. Pp. 26. $0.75.


This issue contains a series of four papers 
presented at the AMA Mid-winter Personnel 
Conference with the sub-title “Social Science 
and the Management of American Business— 
A Report to Management from the University 
of Chicago.”

The participants, and the titles, were: W. 

Lloyd Warner: “Individual Opportunity—A 
Challenge to the Free Enterprise System"; 
Burleigh Gardner: “Conserving and Develop- 
ing our Human Resources”; W. E. Henry: 
“Identifying the Potentially Successful Execu- 
tive"; and E. A. Haggard: “Social and Psycho- 
logical Factors in Work Adjustment." 
While it well may be that American industry 
is satiated or even confused by an inundation of 
“challenges,” these associates of the University 
of Chicago’s Committee on Human Develop- 
ment unhesitatingly throw out several more. 
Warner describes the uncomfortable situation 
that the usual social mobility routes upward 
(occupation and education) to higher status 
positions are becoming inadequate and indeed 
are closed in many areas. The resultant is a 
sort of a mass frustration, a willingness to give 
up the old system of individual mobility, and a 
readiness to blame the system. 

Gardner discusses the overall problem of loss 
of satisfaction and deteriorating morale as 
brought about through such organizational
problems as technological change, size
and over-extended hierarchy, decline of small,
successful, satisfying, owner-managed firms,
and the effects of specialization. Paradoxical
to the apparent purpose of the papers (as
revealed in the publication title) is “See that
the good men in your organization are found
and give them a chance to use what they have”
(p. 12).

Henry points out the problem of the individuals
who run the organization, the leaders.
He ascribes the qualities of executive leader- 
ship as being to a great extent the qualities of 
individual personality. Personality research, 
according to Henry, among successful men in 
business and industry has revealed a pattern 
of common personality characteristics and that 
this is a pattern of fairly long standing. Ex- 
ecutive skills may be equal but the funda- 
mental personality organizations may be such 
that one executive is successful and another 
isn't. He regards the executive as a "par- 
ticular kind of personality” worthy of per- 
sonality analysis as used elsewhere. For illus- 
tration he indicates that the successful execu- 
tive thinks in terms of the job hierarchy con- 
tinuously, and has a positive reverence for a 
competent parent image. But in general he 
casts research questions rather than gives leads
to identify potentially successful executives.

Haggard relates the above points to the
necessity of satisfying the run-of-the-mill
workers. He is particularly critical of “mechanized
selection” by means of test scores where
the factors of interests, motivations, and
personality structure are not estimated at the
hiring. (He speaks as if most workers got
employed because of test scores whereas only
about 15% of U.S. companies use any test at
all.) He appeals for greater management recognition
of human relationships on the job, of
emotional needs, of the complete individual
. . . Again, “We need to locate the
men who can rise in the organization, and help
. . .”

All this is important and interesting. But
it doesn’t add up to “identifying and developing
potential leaders.” The audience undoubtedly
was stimulated to support research,
activities, and efforts toward this end, which
is all to the good.

Ralph R. Canter, Jr.


University of California at Berkeley 


Robert Hoppock. Group guidance: principles,
techniques, and evaluation. New York: McGraw-Hill
Book Co., 1949. Pp. 393. $3.75.


Group Guidance is not a book for psychologists.
As Dr. Hoppock states in his preface,
"this book has been written for the . . .
(teacher) who has been assigned the responsibility
for group guidance and who wants
to know what it is all about." To those who
object to the term group guidance, as does the
reviewer, Hoppock states, "The author of
this book has no particular desire to debate
the issue. . . . The author prefers the term
‘group guidance’ because it is short, . . .
and descriptive and because it is extensively
used in guidance literature. . . ."

The book should be a useful one to teachers
and administrators in secondary schools.
It is pitched well as to level, it
contains many answers to "how" questions,
and it makes none of the naive claims which
have marred so many similar publications.
Hoppock begins with brief definitions of his
terms but, wisely, does not spend time
rehashing the history of the guidance movement.
He relates the instructional, group approach
to the individual counseling process effectively
and well. He states the functions of the group
approach in the first chapter and adheres to
them for the rest of the way.

Part II, Techniques, is straightforward and
clear. He considers the techniques of follow-up,
visits to jobs and institutions, group conferences,
student survey of jobs, case conferences,
laboratory investigations, self measurement,
and a number of less important methods.
The reviewer takes general issue with the
author only in the realm of self measurement.
In justice to Dr. Hoppock it must be stated
that his approach is a cautious one.
Sad personal experiences leave the reviewer some
doubts as to how cautious all readers of Group
Guidance will be with test scores.

Part III, Evaluation, is excellent. Dr. Hoppock
brings in pertinent research and applies
it in language which should hit the
mark with his intended audience. This section
should be required reading for those students
of education who believe
that . . . is possible by talking at groups
. . . It will also be helpful to those
in clinical psychology who believe
that worthwhile results must come from
dealing with the individual.


Part IV, Appendixes, is a sound contribution 
to teaching methodology. Dr. Hoppock has 
been very successful in using the methods 
which he describes in these appendices and 
reading them convinces the reviewer that most 
of us can learn from what he has presented. 

A sound, well written book, objective in pres- 
entation, and cautious in approach which 
should be received well in educational circles. 


Milton E. Hahn 


University of California, Los Angeles 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor, 
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota 


An index of nomograms. Douglas Payne Adams, Editor. 
New York: John Wiley and Sons, Inc., 1950. Pp. 
174. $4.00. 

With brushes of comet's hair. Cornelia H. Bogert. New 
York: Exposition Press, 1950. Pp. 165. $5.00. 

A history of experimental psychology. Second edition. 
Edwin G. Boring. New York: Appleton-Century- 
Crofts, Inc., 1950. Pp. 777. $6.00. 

Readings on modern methods of counseling. Arthur H. 
Brayfield. New York: Appleton-Century-Crofts, 
Inc., 1950. Pp. 566. $5.00.

College psychology. Warner Brown and Howard C. 
Gilhousen. New York: Prentice-Hall, Inc., 1950. 
Pp. 485. Cloth, $3.75; Paper, $2.85. 

Marriage analysis. Harold T. Christensen. New 
York: The Ronald Press Co., 1950. Pp. 510. $4.50. 

Experimental designs. William G. Cochran and Ger- 
trude M. Cox. New York: John Wiley and Sons, 
Inc., 1950. Pp. 454. $5.75. 

Abnormal psychology and modern life. James C. Cole- 
man. Chicago: Scott, Foresman and Co., 1950. Pp. 
600. $4.50. 

Readability. Edgar Dale, Editor. Chicago: National 
Council of Teachers of English, 1949. Pp. 44. 
$.60 per copy. $.50 each for 10 or more. 

Child guidance approach to juvenile delinquency. Eugene 
Davidoff and Elinor S. Noetzel. New York: Child 
Care Publications, 1950. $4.50. 

Proceedings of second annual meeting of Industrial Rela- 
tions Research Association. Milton Derber, Editor. 
New York: Industrial Relations Research Associa- 
tion, 1949. Pp. 299. Annual subscription to IRRA 
Publications $5.00. 

Rating employee and supervisory performance. M. Joseph 
Dooher and Vivienne Marquis, Editors. New York: 
American Management Association, 1950. Pp. 192. 
$3.75. 

The organization of mental abilities. Jerome Edward
Doppelt. New York: Bureau of Publications, Teachers
College, Columbia University, 1950. Pp. 86.
$2.10.

Guidance services in smaller schools. Clifford P. Froeh- 
lich. New York: McGraw-Hill Book Co., Inc., 1950. 
Pp. 352. $3.75. 

Guidance testing. Clifford P. Froehlich and Arthur L. 
Benson. Chicago: Science Research Associates, 1949. 
Pp. 112. $1.00.

Psychology. Henry E. Garrett. New York: American 
Book Co., 1940. Pp. 323. $3.00. 

Fundamental statistics in psychology and education.
Second edition. J. P. Guilford. New York: McGraw-Hill
Book Co., Inc., 1950. Pp. 633. $5.00.

Educational psychology. Edwin R. Guthrie and Francis
F. Powers. New York: The Ronald Press Co., 1950.
Pp. . . .

Counseling adolescents. S. A. Hamrin and Blanche B.
Paulson. Chicago: Science Research Associates,
1949. Pp. 380. $3.50.

The handbook of child guidance. Ernest Harms, Editor.
New York: Child Care Publications, 1950. Pp. 751.
$6.00.

Industrial psychology. Thomas Willard Harrell. New
York: Rinehart and Co., Inc., 1949. Pp. 462.
$4.25.

Personality, development and assessment. Charles M. 
Harsh and H. G. Schrickel. New York: The Ronald 
Press Co., 1950. Pp. 518. $5.00. 

The organization of behavior. D. O. Hebb. New York:
John Wiley and Sons, Inc., 1949. Pp. 335. $4.00. 

Situational factors in leadership. John K. Hemphill. 
Columbus: Bureau of Educational Research, Ohio 
State University, 1949. Pp. 136. $3.00, cloth; 
$2.50, paper. 

A miniature textbook on feeblemindedness. Leo Kanner.
New York: Child Care Publications, 1950. Pp. 33.
$1.25. 


Gestalt psychology, its nature and significance. David 
Katz. New York: The Ronald Press Co., 1950. 
$3.00. 


Mental tests in clinics for children. Grace H. Kent.
New York: D. Van Nostrand Co., Inc., 1950. Pp.
180. $2.45.

The biology of human starvation. Ancel Keys et al.
Minneapolis: University of Minnesota Press, 1950.
Pp. 1360. $24.00.

The yearbook of psychoanalysis. Sandor Lorand, Managing
Editor. New York: International Universities
Press, Inc., 1950. Pp. 317. $7.50.

The first two decades of life. Frieda K. Merry and
Ralph V. Merry. New York: Harper and Brothers,
1950. Pp. 581. $3.75.

Juvenile delinquency in modern society. Martin H.
Neumeyer. New York: D. Van Nostrand Co., Inc.,
1950. Pp. 335. $3.75.

Social psychology. Theodore M. Newcomb. New
York: Dryden Press, 1950. Pp. 800. $4.50.

Child development. Willard C. Olson. Boston: D. C.
Heath and Co., 1949. Pp. 430. $4.00.

An introduction to therapeutic counseling. E. H. Porter, Jr.
Boston: Houghton Mifflin Co., 1950. Pp. 25 . . .
$2.75.

Human ability. C. Spearman and Ll. Wynn Jones.
New York: The Macmillan Co. . . .
$2.50.

Criminology. Revised edition. Donald R. Taft. New
York: The Macmillan Co., 1950. Pp. . . .

Management behavior and foreman attitude. David N.
Ulrich, Donald R. Booz, and Paul R. Lawrence.
Boston: Graduate School of Business Administration,
Harvard University, 1950. Pp. 56. . . .

Report of the Proceedings of the Second . . .
for the Education of Maladjusted Children.
. . . van Houte and Berthold Stokvis, Editors.
Amsterdam, Holland: Systemen Keesing, Ruysdaelstraat
71, 1950. Pp. 448. $4.50.

Social class in America. W. Lloyd Warner, Marchia
Meeker, and Kenneth W. Eells. Chicago: Science
Research Associates, 1949. Pp. 274. $4.25.

Blindness. Paul A. Zahl. Princeton: Princeton University
Press, 1950. Pp. 576. $7.50.

Human behavior and the principle of least effort. George
Kingsley Zipf. Cambridge: Addison-Wesley Press,
Inc., 1949. Pp. 573. $ . . .

Directory of guidance agencies. Ethical Practices Com- 
mittee of the National Vocational Guidance Associa- 
tion. St. Louis: Dr. Nathan Kohn, Washington 
University, 1950. $1.00. 

Revere Safety Test. Revere Copper and Brass, Inc. 
Chicago: Science Research Associates, 1949. Test, 
20 pages. Handbook, 60 pages. Test, $.30; Hand- 
book, $.60. 


Subscription Lists of the 
American Psychological Association 


MEMBERS AND AFFILIATES 
Approximately 9,800 names 
The American Psychological Association main- 
tains an address list of its members and affiliates, 


which is for sale providing the nature of its use is in 
conformity with the purposes of the Association. 


1950 Prices 


Envelopes addressed ....................... $35.00
(advertiser furnishes envelopes and pays express charges)
Addresses on tape, not gummed ............ $17.50
(suitable for a mailing machine)


STATE LISTS 
Priced according to number of names wanted 


SUPPLEMENTARY LISTS 


Approximately 3,500 names in total list 
Individual journal lists vary from 400 to 1,600 


The Association also maintains a list of subscribers 
who are not members of the Association (universi- 
ties, libraries, industrial laboratories, hospitals, other 
types of institutions, and individual subscribers). 
The general list for all journals includes all types.
Each single journal has a more specialized circulation. 


For any one journal, envelopes addressed .... $15.00 
For any one journal, addresses on tape... $10.00 
For all journals, envelopes addressed .... $20.00 
For all journals, addresses on tape ........... $15.00 


For further information, write to
American Psychological Association
1515 Massachusetts Avenue Northwest
Washington 9, D. C.


Journal of Applied Psychology 


Vol. 34, No. 5


October, 1950


An Aptitude Test for Veterinary Medicine *


William A. Owens


Iowa State College 


Particularly since the war, schools of veteri- 
nary medicine have been able to enroll for 
training only a relatively small percentage of 
their applicants. An increased emphasis upon 
the importance of identifying the best-qualified 
candidates has resulted in a careful reexamina- 
tion of selection procedures and in recognition 
of the possible utilization of some sort of 
veterinary aptitude test. 

Accordingly, the problem of the present investigation
was to discover or to develop an
efficient predictor, or predictors, of scholastic
success during the first professional year of
veterinary training.

Preliminary findings are based upon the 
records of all (N=133) freshmen and sopho- 
mores who were enrolled in the School of 
Veterinary Medicine at The Iowa State College 
during the academic year 1947-48. 

Validation findings are based upon the
academic records of 150 pre-veterinarians
tested and subsequently enrolled in Veterinary
Medicine at either Cornell University (N = 25),
Michigan State College (N = 41), Kansas
State College (N = 49), or Iowa State College
(N = 35) during the academic year 1948-49.


Method 


Indices already available were examined as
to their predictive utility. These included:
high school academic average, pre-veterinary
college average, grade in certain specific pre-veterinary
courses, scores on the ACE psychological
examination, and sub-test scores on
Form 20 of the Moss Aptitude Test for Medical
Professions (3).

* The writer wishes to acknowledge the invaluable
counsel and assistance of Dr. James E. Wert.

Since none of the above proved to be a
highly satisfactory predictor of academic
success, it was decided to construct four special
purpose tests. Two of these were 50-item 
achievement tests, constructed with the assist- 
ance of the departments concerned, and over 
the content of the two most predictive pre- 
veterinary courses—chemistry and zoology. 
The remaining two were, respectively, 60 and 
50-item aptitude tests designed to measure the 
same abilities as the most predictive pair of 
sub-tests in the Moss ATMP. The first, 
called “Paragraph Comprehension,” is a read- 
ing test; and the second, designated as “Verbal 
Memory," involves the timed study of standard 
selections with a subsequent objective test 
upon accuracy of recall. Both are entirely 
new, and their content was judged by members 
of the staff at Iowa State to be representative
veterinary content. The usual psychometric 
procedures relative to establishment of time 
limits, analysis of items, and general revision, 
were employed following a cross-sectional ex- 
perimental study of the four tests in 1947-48 
and prior to a longitudinal study of their 
validity in 1948-49. The chemistry and 
zoology achievement tests were combined sub- 
sequent to the 1947-48 study, since their inter- 
correlation approached their reliabilities; the 
composite, shortened to 80 items, was simply 
called "Pre-veterinary Achievement." It may
be noted that none of these test results have 
been employed in selection and that the data 
are, in this regard, uncontaminated. 

In the validational study, the criterion 
adopted for the evaluation of veterinary apti- 
tude was that of academic success during the 
first semester or first two quarters of pro- 
fessional training. A total grade-point average 
for this period was employed for three reasons: 
(1) such an average is highly correlated with 
ultimate scholastic standing; (2) academic
mortality is heaviest at this time; and (3) it
seemed to be the longest waiting interval in a
longitudinal study compatible with the urgency
of the case. As these data were received from 
the several cooperating schools, each subject's 
grade-point average was assigned a standard 
score value in a distribution for the institution 
from which it had been obtained. 
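
The within-institution standardization described above can be sketched as follows. This is a minimal illustration only: the article does not state which standard-score scale was used, so a plain z-score is assumed, and the school names and grade-point averages are hypothetical.

```python
from statistics import mean, pstdev


def standardize_within_school(gpas_by_school):
    """Convert each grade-point average to a z-score within its own school.

    Assumption: a simple z-score (mean 0, SD 1 per school). The article
    only says "a standard score value in a distribution for the institution".
    """
    z_by_school = {}
    for school, gpas in gpas_by_school.items():
        m = mean(gpas)
        s = pstdev(gpas)  # SD of that school's own distribution
        z_by_school[school] = [(g - m) / s for g in gpas]
    return z_by_school


# Hypothetical grade-point averages, for illustration only.
example = {
    "School A": [2.0, 2.5, 3.0, 3.5],
    "School B": [1.0, 2.0, 3.0, 4.0],
}
z = standardize_within_school(example)
print(z)
```

After this step a student's standing is expressed relative to classmates at the same school, so differences in grading standards between schools no longer enter the pooled criterion.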

Statistical analysis of these data follows the 
usual correlational form with the exception 
that a discriminant function style of analysis 
has been employed in obtaining estimates of 
the weights to be assigned to each of the several 
new tests in order to produce composite scores 
which maximize the differences between certain 
performance groups. In addition, multiple 
biserial correlations have been employed at 
one point, after a method described by Wert 
(4), to estimate the combined effects of the 
tests in predicting the dichotomous criterion 
of performance. 


Results 


The primary results of this investigation 
have been summarized in four tables. Prior 
to some systematic comment upon them, it 
seems appropriate to observe that the “corrected”
odd-even reliability of the composite
or total test score derived from the several
weighted sub-test scores was 0.88. This
estimate is based upon results obtained from
the entire tested population of 424 candidates
for admission into veterinary training at the
previously named cooperating institutions.
Total test reliability thus seems reasonably
satisfactory.¹
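
A "corrected" odd-even reliability is conventionally obtained by correlating scores on the odd-numbered and even-numbered items and stepping the half-test correlation up with the Spearman-Brown formula. The sketch below assumes that convention; the article does not spell out the computation, the composite here is an unweighted item sum rather than the weighted sub-test composite, and the response matrix is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """item_scores: one row per examinee, one 0/1 entry per item."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


# Hypothetical responses, for illustration only.
responses = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
print(odd_even_reliability(responses))
print(spearman_brown(0.5))
```

Under this formula, a corrected value of 0.88 would correspond to an uncorrected half-test correlation of roughly 0.79.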

To proceed, in Table 1 are shown the correla- 
tions between various predictors and freshman 
veterinary average. Under “Existing Indices” 
it is interesting to note that pre-veterinary 
chemistry average appears to be the best single 
predictor. A possible artifact involved is the 
relatively large variance in chemistry grades 
as compared with that typical of other pre- 
veterinary subjects. 

Sub-tests three and six of the Moss ATMP
suggested functions to be measured by the two
new aptitude tests, and the former may be
broadly considered as models for the latter.


Table 1
Correlations of Various Predictors with Freshman
Veterinary Average
(Preliminary Study, N = 133)

Variable                                          r*

Existing Indices
  Total pre-veterinary average                   .40
  Pre-veterinary chemistry average               .47
  Pre-veterinary zoology average                 [illegible]
  Raw score on ACE                               .02
Moss ATMP Subtests
  Visual Memory                                  .43
  Memory for Content                             .13
  Comprehension and Retention                    .22
  General Information                           -.11
  Vocabulary                                    -.01
  Understanding of Printed Material              .38
  Application of Principles                      .47
  Logical Reasoning                              .02
Four New Tests
  Chemistry Achievement Test                     .27
  Zoology Achievement Test                       .42
  Paragraph Comprehension                        .47
  Verbal Memory                                  .57
  Sum of Paragraph Comprehension and Verbal
    Memory                                       .62

* At 5 per cent level r = .17; at 1 per cent r = .22.


As previously indicated, the “Four New
Tests” were reduced to three following the
evidence derived from the preliminary study
and through the combination of the chemistry and
zoology achievement tests to form the revised
Pre-veterinary Achievement Test. In spite
of its apparently poor validity, this content was
tentatively retained because it was recognized
that the selection process at Iowa State, involving
heavy emphasis upon pre-veterinary
success in chemistry and zoology, brought
about a tremendous restriction in range of
talent on these tests which might not be
duplicated at other institutions. In the event
that it were not, it was felt that they could
conceivably be of substantial value.

Table 2
Correlations of Test Scores with Grade-Point Average (Validational Study, N = 150)

                                    P.-M.          Tetrachoric r
                                    Correlation*   (G.P.A.: ½ vs. ½)
Pre-veterinary Achievement (X1)     0.24           0.41 (31%)
Paragraph Comprehension (X2)        0.36 (0.47)    0.56 (25%)
Verbal Memory (X3)                  0.17           0.26 (28%)

* 5% r = 0.16 and 1% r = 0.21.

Table 3
Multiple Biserial Correlations and Discriminant Test Weights (Validational Study, N = 150)

                                    Multiple Biserial r's   Weights in the
                                    (G.P.A.: ¾ vs. ¼)       Discriminant Function
X2 and X3 (all students)            0.37
X2 and X3 (within schools)          0.43                    v = 2.05 X2 + X3
X1, X2 and X3 (all students)        0.40
X1, X2 and X3 (within schools)      0.45 (0.57)             v = -0.58 X1 + 2.13 X2 + X3

From Table 2, then, it is apparent that each
of the three new tests much better predicts
academic inability from a relatively low cutting
score than academic standing from the whole
range of scores (this admittedly within a
successful group). As evidence, the product-moment
correlations in the left-hand column
are consistently and substantially smaller than
the tetrachoric correlations² in the right-hand
column. These latter were obtained by arbitrarily
dichotomizing grade-point averages at
their mean, and by then dichotomizing test
score distributions at the points of minimum
error in classification, at or slightly above the
lowest quartile point. The percentage to the
right of each coefficient is that below the cutting
score in the truncated tail of the test
distribution in question.
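
To make the idea of a tetrachoric coefficient from a fourfold table concrete, the sketch below uses the well-known cosine-pi approximation. This is a stand-in, not the method of the article, which read its values from the computing diagrams of Chesire et al. (1); the approximation is reasonably close only when both variables are cut near their medians, which is not the case for the low cutting scores used here. The cell counts are hypothetical.

```python
from math import cos, pi, sqrt


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    Fourfold table (counts):
        a = above test cut, above criterion cut
        b = above test cut, below criterion cut
        c = below test cut, above criterion cut
        d = below test cut, below criterion cut
    """
    if b * c == 0 and a * d == 0:
        raise ValueError("table is degenerate; coefficient undefined")
    if b * c == 0:
        return 1.0   # no discordant cases
    if a * d == 0:
        return -1.0  # no concordant cases
    return cos(pi / (1 + sqrt((a * d) / (b * c))))


# Hypothetical counts, for illustration only.
print(tetrachoric_cos_pi(40, 10, 10, 40))
print(tetrachoric_cos_pi(25, 25, 25, 25))
```

With a low cutting score on the test, as in the present study, a maximum-likelihood routine or the original computing diagrams would be the safer choice.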

Since it appeared to be most efficient in each
case to set cutting scores on the tests near the
25th centile, it was arbitrarily decided to
break the distribution of grade-point averages
at the same level and to attempt to determine
how best to weight each test to maximize the
differences between these two segments of the
criterion.
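
As a small illustration of how a weighted composite of this kind is applied, the sketch below computes a composite from the second discriminant function reported in Table 3 (v = -0.58 X1 + 2.13 X2 + X3) and compares it with a cutting score. Whether X1, X2 and X3 enter as raw or scaled scores is not stated in the article, and the example scores and the cutting score are hypothetical.

```python
def v_score(x1, x2, x3):
    """Composite from the three-test discriminant weights in Table 3.

    x1 = Pre-veterinary Achievement, x2 = Paragraph Comprehension,
    x3 = Verbal Memory (score scale assumed, not given in the article).
    """
    return -0.58 * x1 + 2.13 * x2 + x3


def below_cutting_score(v, cutting_score):
    """True when a candidate falls in the low tail of the composite."""
    return v < cutting_score


# Hypothetical sub-test scores and cutting score, for illustration only.
v = v_score(10, 20, 30)
print(v, below_cutting_score(v, cutting_score=70.0))
```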
Thus, in Table 3, are shown the multiple
biserial correlations and discriminant-function
weights based upon this proposed dichotomy
in the criterion. It may be noted that the
correlations for “all students” are considerably
smaller than are those “within schools.” This
is no doubt attributable to test-wise institutional
differences so large that an individual's
scores might rank him in the highest quarter
at school A and in the lowest at school B.
The second series of discriminant function
weights in Column 2 are those which were
employed to derive a single series of composite
or total scores, the “V scores,” for all tests.

Table 4 is a summary table. In it appear
the tetrachoric correlations between the dichotomized
composite test scores and the
dichotomized criterion. The per cent to the
right of each coefficient, again, indicates the
proportion of cases below the minimum error
cutting score in the tail of the test distribution.
Final results were cast in this form to make
them coincident with the form of the practical
question as it always arises: is a given subject
above or below some critical test score, and
how does this argue for his being above or
below some accepted grade standard?

It is to be regretted that the number of
cases for each institution is not larger, although
application of the chi-square test to these data
suggests that the least significant tabled relationship
surpasses the 2% probability level.
A sort of increase in numbers may, of course,
be achieved by ignoring institutional differences
and combining the data. This, naturally,
lowers the apparent degree of relationship; in
this instance it results in a single r_t of 0.52.

² Magnitudes estimated from the computing diagrams
of Chesire et al. (1).


Table 4

Tetrachoric Correlations Between Composite Test Scores and Grade-Point Average (Validational Study)

                       Cornell      Iowa State   Kans. State  Mich. State
                       Univ.        Coll.        Coll.        Coll.
                       N = 25       N = 35       N = 49       N = 41
High ¾ vs. Low ¼       0.72 (16%)   0.70 (12%)   0.62 (18%)   0.48 (12%)
(Criterion)


Interpretation


In interpreting these results several conflicting
influences must be recognized. The
estimates of test-criterion relationship provided
in Table 4 may be thought of as overestimates
for at least two reasons. First, they are
based, after the fact, upon the most efficient
test cutting score; whereas, in practice it may
be impractical to find or to employ such a
value, which will in any case show sampling
fluctuations. Second, the scoring weights
established for the combination of sub-test
scores were derived from a composite of four
populations and then applied to this same composite
population. A “shrinkage” in discriminative
efficiency must be expected when these
weights are applied to the scores obtained from
a new population, although the customary
effect may be minimized in this instance since
the original sampling was of a broader and
more heterogeneous group than could have
been obtained at a single school.

Running counter to these two influences,
which would tend to make the values of Table
4 appear to be overestimates, is the undoubted
fact that the coefficients shown have been depressed
by a marked restriction in the range
of talent existing within the validational group.
At some institutions, the standard deviation of
composite test scores is 25 to 30 per cent larger
in the distribution of candidates for admission
than in the distribution of selectees. This is a
fact mainly attributable to current selection
on the basis of pre-veterinary grade-point average,
a test-correlated variable. If Kelley's (2)
correction for homogeneity were applied to the
present data to obtain estimates of test-criterion
correlations within the population of
candidates, many of the relationships here reported
would be substantially increased in
magnitude. For example, in Table 2, the
Paragraph Comprehension test would correlate
0.47 with the criterion instead of 0.36; and, in
Table 3, the final multiple correlation would be
0.57 instead of 0.45.
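
The direction and rough size of such a correction can be illustrated with the widely used formula for direct restriction of range on the predictor. This is offered only as a sketch: it may differ from the formula in Kelley (2), it ignores the fact that selection here operated on a correlated third variable, and it is not claimed to reproduce the corrected values quoted above. The symbol `sd_ratio` (unrestricted SD divided by restricted SD) is introduced here for illustration.

```python
from math import sqrt


def correct_for_range_restriction(r, sd_ratio):
    """Estimate the correlation in the unrestricted group.

    r        : correlation observed in the restricted (selected) group
    sd_ratio : SD of the predictor among all candidates divided by its
               SD among those selected (1.0 means no restriction)
    """
    return r * sd_ratio / sqrt(1 - r ** 2 + (r ** 2) * (sd_ratio ** 2))


# Observed r for Paragraph Comprehension, with an SD ratio of 1.30
# taken from the upper end of the "25 to 30 per cent" stated above.
print(round(correct_for_range_restriction(0.36, 1.30), 3))
```

The correction always moves a non-zero coefficient away from zero when the candidate group is more variable than the selected group, which is the point the paragraph above is making.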

In addition, there is evidence to suggest that 
within the groups admitted to veterinary 
training, those who had taken the tests, and 
who composed slightly more than half of the 
total number, were superior in performance to 
those who had not taken the tests. At least
a partial explanation is that, almost without 
exception, those tested had had their pre- 
veterinary training at one of the four coopera- 
ting institutions. By and large students with 
this background make better grades and fail 
less frequently. 

In evaluating these conflicting influences the 
writer is disposed to judge it as not unlikely 
that they approximately cancel and offset 
each other, and that the estimates of relation- 
ship reported are, therefore, not grossly in error. 


Summary 


A veterinary aptitude test has been devised
which had the following characteristics in the
populations studied:


(1) It had a reliability of 0.88.

(2) It had tetrachoric validity coefficients
of from 0.48 to 0.72, against a grade-point
average criterion.

(3) It was a better predictor of the specified
criterion than were pre-veterinary grades,
singly or collectively, or the ACE.

(4) It predicted most efficiently, within the
validational group, from a relatively low
cutting score.


Received October 24, 1949. 




References 


1. Chesire, L., Safir, M., and Thurstone, L. L. Com- 
puting diagrams for the tetrachoric correlation 
coefficient. Chicago: The University of Chicago 
Bookstore, 1933. 

2. Kelley, T. L. Statistical method. New York: Mac- 
millan, 1923, pp. 223-228. 


3. Moss, F. A., and Hunt, Thelma. Aptitude test for 
medical professions. Prepared especially for the 
War Department by the authors and under a 
committee of the Association of American Medi- 
cal Colleges. 

4. Wert, J. E. Unpublished manuscript available from 
the author, Iowa State College. 


Minnesota Psycho-Analogies Test * 


Abraham S. Levine 


Human Resources Research Center, Lackland Air Force Base, San Antonio, Texas 


This project was undertaken with the general 
objective of developing an evaluation instru- 
ment for psychology students. Since it was 
hypothesized that achievement in advanced 
psychology courses is to a large extent a func- 
tion of a complex of general academic ability 
and previous psychological background, a test 
was developed to provide a composite measure 
of these factors. This test comprises items of 
the four-alternative multiple choice variety in 
analogy form. The first part of each item 
contains general vocabulary and information. 
The latter part of these items consists of a 
broad sampling of psychological terms, con- 
cepts and expressions which attempt to sample 
as widely as possible the content of all the 
major fields of psychology. The response 
alternatives are all psychological in content. 
Thus the first two terms of the analogy are of 
a general nature, usually non-psychological in 
content; while the third and fourth terms are 
psychological in character. An example of 
this type of item, with the correct response in 
italics, is as follows:


Orchestra: Violinist: : Test: (1. Battery, 2. Item 
Analysis, 3. Item, 4. Validity) 


It was believed that a special analogies test
of the type described would serve the dual 
function of selection and general evaluation, 
depending upon the point in a student's career 
at which it was administered. More specifi- 
cally, it was anticipated that the test would 
serve some or all of the following purposes: 


1. Selection of students for certain advanced 
courses or for certain sections of advanced 
courses in psychology. 


2. Selection of graduate majors in psy- 
chology. 


* This article is based on the writer’s Ph.D. thesis
done under the direction of Prof. Donald G. Paterson
and entitled “A Psycho-Analogies Test as an Evaluation
Instrument for Psychology Students," completed in
December, 1949, and on file in the University of Minnesota
Library.


3. Selection of applicants for special training 
programs such as the Veterans Adminis- 
tration Clinical Psychology Program. 

4. General evaluation of professional fitness 
of a student completing requirements for 
a degree. 

5. Measure of growth of student by utiliza- 
tion of two equivalent forms of the test 
at different stages in his training. 


Preliminary Editions of the Test 


The first edition of the test, called “Psycho- 
logical Analogies," was exploratory in nature. 
It consisted of 100 items. This test correlated 
.60 with combined midquarter and final ex- 
amination scores for 92 students in a Senior 
College class in Vocational and Occupational 
Psychology at the University of Minnesota. 

The second or preliminary edition of the
test was given the title “Psycho-Analogies.”
It consisted of a total of 232 items, divided
into two forms of 116 items each. While some
of these items represented the more discriminating
items of the original test, or modifications
of these, the bulk of them were still new
and untried. These new items were on the
whole more carefully constructed, and greater
attention was given to balancing subject matter
content. The test was administered to several
sections of Vocational and Occupational Psychology
and Individual Differences at the University
of Minnesota. These two courses are
representative of intermediate and advanced
courses in psychology in which juniors, seniors,
and graduate students are enrolled. A total
of 161 cases was obtained on Form A and . . .
cases on Form B. Correlation of a single form
with combined midquarter and final objective
examinations in the various sections ranged
from .51 to .74. The available data also suggested
that Psycho-Analogies was a slightly
better predictor of course achievement than
either a general analogies test (Miller Analogies,
Form A or B) or a specially designed achievement
pretest. Moreover, a tendency was
noted for the correlations between Psycho-
Analogies and course examination scores to be
higher in the more advanced sections.

In order to meet the need of a brief pretest 
for purposes of screening or sectioning certain 
psychology courses, a short form of Psycho- 
Analogies was developed, consisting of the 50 
most discriminating of the 232 items. In those 
classes in which the test was not used as a 
basis for sectioning, so that the whole range of 
talent was available for purposes of validation, 
the correlations between the short form and 
combined midquarter and final examinations 
ranged from .41 to .68. 


Minnesota Psycho-Analogies 


The final edition of the test, Minnesota 
Psycho-Analogies, consisted of 150 items 
divided into two forms, A and B, of 75 items 
each. In the selection of these 150 items, item 
analysis data for internal consistency and 
difficulty of the preliminary forms were utilized. 

In constructing the two forms a strong endeavor was made to balance
them with regard to both difficulty and subject matter content, in
order to obtain two equivalent forms. Four additional sample
problems were included in both forms, making a total of eight items
in each of the fore-exercises. This was done in order to minimize
possible practice effect. In preceding editions of the test, a
liberal time limit was allowed permitting almost all students to
finish the test. No time limit was imposed in Minnesota
Psycho-Analogies, in order to avoid even more stringently whatever
negative influences the time factor may have on the scores of an
analogy-type test. The responses to either form are indicated on a
standard IBM answer sheet which may be machine scored.

One limitation should be pointed out: the population sampled for
Psycho-Analogies differed in certain characteristics from the
population for which Minnesota Psycho-Analogies will be used.
Practical considerations determining the number and availability of
subjects made it impossible to obtain the most appropriate group for
the purpose of initial selection of items. However, the item
difficulty was somewhat higher than 50 per cent, if appropriate
correction is made for chance success. Since the instrument would
eventually be used mostly on graduate students, who, as a group, are
superior to the original sample, it was considered desirable to have
the average item above the 50 per cent difficulty level.
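The correction for chance success mentioned above is not written out in the article. The sketch below assumes the conventional adjustment for multiple-choice items, in which the chance proportion 1/k (k alternatives) is removed from the observed proportion passing; the function name and the example figures are hypothetical, not taken from the study.

```python
def corrected_difficulty(p_observed: float, n_alternatives: int) -> float:
    """Proportion passing an item after removing the share expected from guessing.

    Assumes the conventional correction (p - 1/k) / (1 - 1/k); the article
    does not state which formula was actually used.
    """
    chance = 1.0 / n_alternatives
    return (p_observed - chance) / (1.0 - chance)


# Hypothetical item: 62.5 per cent of examinees choose the keyed answer
# among four alternatives.
print(corrected_difficulty(0.625, 4))
```

Under this assumption an item passed by 62.5 per cent of a group sits at the 50 per cent level once guessing is allowed for.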

In administering these forms the major con- 
cern was to obtain adequate normative data for 
graduating seniors and various levels of gradu- 
ate students. A secondary objective was 
to obtain additional course prediction data
wherever possible.

Both forms of Minnesota Psycho-Analogies 
were administered to 33 graduating seniors 
and 125 graduate students majoring in psy- 
chology at the University of Minnesota. The 
graduating senior sample consisted of volun- 
teers and represents only 25% of the total 
available population, whereas the graduate 
students were required to take the test and 
93% of them complied. There is some possi- 
bility, therefore, that the graduating seniors 
tested represented a positively biased sample 
from their population. Half of the total group 
took Form A first and the other half took 
Form B first, thereby producing a balanced 
experimental design to permit estimation of 
whatever practice effect may accrue from 
taking one form first. In addition to obtaining 
these normative data, Form A was admin- 
istered to one section of Vocational and Oc- 
cupational Psychology and a section of Intro- 
ductory Laboratory Psychology in order to 
obtain additional course prediction data. 

Table 1 presents the means and standard 
deviations of all groups who took Form A in 
the spring and summer of 1949. At this time
only 50 of the graduate students had been 
tested. Almost the entire range from a chance 
score of 17.5 to the maximum score of 75 is 
utilized. Despite some overlap, which should 
be expected, the upper part of the range 


Table 1 


Means and Standard Deviations of Groups Who Took 
Form A of Minnesota Psycho-Analogies 


Group N M S.D. 
Graduate Student Majors 50 64.1 5.58 
Graduating Senior Majors 33 51.7 7.37 
Psychology 130 39 40.7 8.78 
Psychology 5 23 32.9 7.84 


302 Abraham S. Levine 


Table 2 


Means and Standard Deviations on Form A, Form B, and Forms A and B Combined, of Minnesota 
Psycho-Analogies for Four Groups of Students Majoring in Psychology 


                                                                          Forms A and B
                                          Form A          Form B            Combined
Group                             N      M     S.D.      M     S.D.       M       S.D.
Graduating Seniors               33    51.7    7.37    50.2    7.15     101.9    13.78
First Year Graduate Students     50    57.6    6.91    57.3    6.67     114.9    12.76
Second Year Graduate Students    44    61.6    6.46    60.0    6.25     121.6    11.55
Third Year Graduate Students     31    66.6    4.25    65.7    5.16     132.2     7.93


is utilized by graduate student majors. This is followed in order
by graduating senior majors, Vocational and Occupational Psychology,
and Introductory Laboratory Psychology. It should be pointed out
that Vocational and Occupational Psychology was a course open to
juniors, seniors and graduate students who had at least nine quarter
credits in psychology but who were not necessarily psychology
majors; and Introductory Laboratory Psychology was an elementary
sophomore course. The graduate student sample was made up for the
most part of graduate students who were more advanced on the average
than the more complete sample reported in Table 2.

Results on Forms A and B separately and combined are presented in
Table 2. The data for the total sample of 125 graduate students
appear in this table. These graduate majors are broken down into
first, second and third year groups depending on how much graduate
work they had taken previously. The data


Fig. 1. Distribution of scores on Form A of Minnesota
Psycho-Analogies for four groups of students majoring in psychology:
Graduating Seniors (N = 33), First Year Graduate Students (N = 50),
Second Year Graduate Students (N = 44), and Third Year Graduate
Students (N = 31). Each x = one student.




indicate that the variability of scores decreases as central
tendency increases in successive year groups; and that perhaps
Minnesota Psycho-Analogies is most useful at the graduating senior
and first year graduate student levels, and least useful for
advanced Ph.D. candidates. These general trends are illustrated in
Figure 1, which presents the distribution of scores on Form A. The
critical ratios of the differences between means of successive
groups were computed for Forms A and B combined. All of these
critical ratios were significant beyond the 1% level, indicating
that statistically significant discriminations are made between
successive levels of psychology majors.
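The critical ratios themselves are not printed in the article. As a rough check, the sketch below recomputes them from the combined-form means, standard deviations, and Ns in Table 2, assuming the usual large-sample form (difference between means divided by the standard error of the difference for independent groups). The author's exact computation may have differed in detail.

```python
import math

# (group, mean, S.D., N) for Forms A and B combined, as read from Table 2.
groups = [
    ("Graduating Seniors", 101.9, 13.78, 33),
    ("First Year Graduate Students", 114.9, 12.76, 50),
    ("Second Year Graduate Students", 121.6, 11.55, 44),
    ("Third Year Graduate Students", 132.2, 7.93, 31),
]


def critical_ratio(m1, sd1, n1, m2, sd2, n2):
    """Difference between two independent means divided by its standard error."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m2 - m1) / se_diff


# Compare each group with the next more advanced group.
pairs = list(zip(groups, groups[1:]))
ratios = [critical_ratio(*lower[1:], *upper[1:]) for lower, upper in pairs]

for (lower, upper), cr in zip(pairs, ratios):
    print(f"{lower[0]} vs. {upper[0]}: CR = {cr:.2f}")
```

A ratio above about 2.58 corresponds to the two-tailed 1% level for a normal deviate, which is the criterion the text refers to.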

Norms were developed for the four categories of psychology majors
for each form separately and for both forms combined. These norms
are presented in the manual designed to accompany Minnesota
Psycho-Analogies. Since courses of instruction and selection
standards vary at different institutions, norms should be developed
for each institution separately before using Minnesota
Psycho-Analogies as a service instrument.

Available data tend largely to support the assumption that Forms A
and B are equivalent for samples of graduating seniors and graduate
students. However, there is some suggestion that Form B is slightly
more difficult, but the difference is only one raw score point
between means. Since these forms were constructed largely on the
basis of item analysis data of a sample selected from a more
heterogeneous population, equivalence will eventually have to be
determined separately for different populations. At any rate, the
possible difference in difficulty even for the particular
populations represented by the samples studied is of a rather small
magnitude from a practical point of view.

Correlation between Forms A and B for the original sample of
graduating seniors and graduate students was .89. For the total
sample of graduate students alone the reliability estimate dropped
to .78. Using the Spearman-Brown formula, the reliability estimate
for the total test of 150 items for graduate students alone is .88.
There is some indication in the above coefficients that for a
restricted range of talent such as graduate students, both forms of
the test should be combined in order to provide a stable enough
score for individual prediction or diagnosis. More practical
indicators of the stability of individual scores are obtained from
the standard errors of measurement which are as follows: Form
A = 3.3, Form B = 3.2, Forms A and B combined = 4.6.
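The step from .78 to .88 can be reproduced with the Spearman-Brown formula for a test of doubled length. The sketch below also includes the conventional standard error of measurement, S.D. times the square root of (1 - reliability). The standard deviation passed to it is a hypothetical value for illustration only, since the article does not report the S.D.s on which its standard errors were based.

```python
import math


def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Reliability of a test lengthened by length_factor, given reliability r."""
    return length_factor * r / (1.0 + (length_factor - 1.0) * r)


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Conventional SEM: S.D. of obtained scores times sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# Inter-form correlation for graduate students alone, stepped up to 150 items.
combined_reliability = spearman_brown(0.78)
print(round(combined_reliability, 2))

# Hypothetical S.D. of 7.0 for a single form; not a figure from the article.
print(round(standard_error_of_measurement(7.0, 0.78), 1))
```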

Since the difference in means between the 
form administered first and the form adminis- 
tered second was not significant even at the 5% 
level, it may be concluded that there is no 
practice effect from taking one form first when 
there is no time limit for taking either form.

The rather appreciable correlation of .72 was 
obtained between Miller Analogies, Form G, 
and the complete Minnesota Psycho-Analogies 
for the 96 graduate students on whom these 
test data were available. The magnitude of 
this correlation is partially attributable to the 
fact that the more advanced graduate students,
because of the nature of the selection involved,
tend as a group to be more “Miller-bright”
as well as having more training in
psychology. 

For the same sample of 96 graduate students, 
the mean raw score of 78.9 on Miller Analogies 
is about as relatively high in terms of the total 
possible score of 100 as the mean raw score of 
121.1 is on the total 150 items of Minnesota 
Psycho-Analogies. This comparison is de- 
fensible since both tests have four alternatives 
and the number right constitutes the score in 
both cases. This fact is pointed out since 
Miller Analogies, Form G, was developed 
especially for graduate students, and insofar as 
the principal use of Minnesota Psycho- 
Analogies will also be for graduate students, 
one of the main considerations is to have ade- 
quate ceiling for the upper ranges. Because 
items were selected for Minnesota Psycho- 
Analogies on the basis of an analysis of a lower 
mean level sample, there is some question as to 
adequacy of ceiling at the upper levels. It is 
interesting to note that the mean Miller score 
of 78.9 is almost at the 80th centile of graduate 
students in general. So it is also conceivable 
that the group of graduate students in the 
University of Minnesota’s Department of 
Psychology is somewhat above the mean of 
other graduate departments in Miller ability 
as well as in the kinds of special training which 




Minnesota Psycho-Analogies is measuring, and 
that therefore the norms developed here may 
be rather high for graduate departments in 
general. 

Also significant is the comparison of standard 
deviations on both tests for the sample being 
considered. The standard deviation for Miller 
G is 8.98 and for the Psycho-Analogies, Forms 
A and B combined, is 12.59. Again, in view of 
the relative number of items, both tests may 
be regarded as having approximately equal 
variabilities for graduate students in psychol- 
ogy at the University of Minnesota. It is to 
be expected then that in those other depart- 
ments of psychology where the variability on 
the Miller is greater, the spread of Minnesota 
Psycho-Analogies scores will also be greater— 
particularly in view of the possibility that the 
special informational content of Minnesota
Psycho-Analogies may be unduly influenced by
the subject matter content of courses in the 
Department at the University of Minnesota. 
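The comparisons of means and standard deviations "in view of the relative number of items" can be made concrete by expressing each statistic as a fraction of the maximum possible score. The sketch below uses only the figures quoted in the text for the 96 graduate students. Dividing a standard deviation by the number of items is a crude scaling adopted here for illustration, not a procedure the article specifies.

```python
# Figures quoted in the text for the same 96 graduate students.
tests = {
    "Miller Analogies, Form G": {"items": 100, "mean": 78.9, "sd": 8.98},
    "Minnesota Psycho-Analogies, Forms A and B": {"items": 150, "mean": 121.1, "sd": 12.59},
}

# Express mean and S.D. as fractions of the maximum possible score.
relative = {
    name: {
        "mean": stats["mean"] / stats["items"],
        "sd": stats["sd"] / stats["items"],
    }
    for name, stats in tests.items()
}

for name, stats in relative.items():
    print(f"{name}: mean = {stats['mean']:.3f} of maximum, S.D. = {stats['sd']:.3f} of maximum")
```

On this scaling the two means differ by about two hundredths of the score range and the two standard deviations by less than one hundredth, which is the sense in which the text calls them approximately equal.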

Also similar are the shapes of the distribu- 
tions of the two tests. Both Miller and Min- 
nesota Psycho-Analogies test scores are nega- 
tively skewed. The form of these distributions 
may reflect to some extent the kind of selection 
that has taken place, and it is a plausible 
hypothesis that if graduate students in psy- 
chology at the University of Minnesota were 
less rigorously selected, then both distributions 
would assume a more normal form. 

A correlation of .69 was obtained between 
combined examination scores for Introductory 
Laboratory Psychology and scores on Form A 
of Minnesota Psycho-Analogies. For Psy- 
chology 130 an r of only .30 was obtained. The 
latter r is significantly lower than hitherto 
obtained on similar samples with preceding 
forms of the test. The lower r may be partially 
attributed to the more restricted range of the 
criterion, since only a two-hour final examina- 
tion was given instead of the usual one-hour 
mid-quarter and two-hour final. Upon query- 
ing the instructor for this course, another more 
interesting explanation emerged. It seems 
that in selecting items for the final examination 
an attempt was made to weed out items which 
appeared to be saturated with either general 
ability or previous background, since less time 
was available for testing than is ordinarily 
the case. 




Conclusions 


It may be concluded on the basis of the data 
obtained in this project that a special analogies 
test is a useful predictor of achievement in
psychology courses, particularly at the more 
advanced levels. That the special analogies 
test also functions as a good terminal evalua- 
tion instrument is indicated by the rise in 
mean scores with increased amounts of course 
work and with higher levels of attainment in 
psychology. 

As a selection instrument, Minnesota Psycho-Analogies may be
conceived of as a supplement to Miller Analogies. Thus Miller
Analogies may be used as one of the criteria for admission to
graduate work, and Minnesota Psycho-Analogies may be employed to
further determine competency to undertake advanced work in
psychology. As such, both instruments will be used as successive
hurdles, and the score on Minnesota Psycho-Analogies may be better
interpreted in the light of Miller Analogies ability. Whether or not
any student who attains a high score on Miller Analogies but who
gets a low score on the special analogies should be excluded from a
particular department would be a function of the available
facilities at the time and the nature of the other relevant data on
the individual. Theoretically, a student who gets a high score on
Minnesota Psycho-Analogies should also do well on Miller Analogies.
If this is not the case, it may be due to the defects inherent in
analogy tests under timed conditions for certain kinds of
individuals, and it should provide the basis for re-administration
of Miller Analogies, as is sometimes done. Minnesota
Psycho-Analogies has incorporated certain administrative advantages
based on experience with the Miller. Thus, for example, there are
two forms of the instrument which make certain kinds of cheating
more difficult. Also, when it is necessary to re-administer the test
another form may be given, thereby obviating specific memory
factors. The untimed nature of the test tends also to reduce some of
the sources of contamination.

The basic rationale underlying the psychological analogies test may
be extended to other fields where a similar need exists. Thus it




would be possible to develop similar instruments for biology,
sociology, political science, economics, and so on. A still more
useful application in terms of the magnitude of the selection task
would be in the realm of medical aptitude tests. A special analogies
test designed to measure achievement in biology, chemistry, and
physics, as well as general ability may be developed for medical
school selection purposes. Or it may be found more feasible to
utilize such a test as a subtest in a more diversified instrument.
At any rate, the data obtained in this project would tend to
indicate the feasibility of exploring the possible uses of special
analogies tests in other fields.


Received December 20, 1949.


A Note on Norms for the Purdue Industrial Mathematics Test 
and the Adaptability Test 


Howard E. Page 
CNATra Staff, NAS, Pensacola


The Management Engineering Division of
the Naval Air Station, Pensacola, Florida is 
faced, from time to time, with the task 
of selecting from a large number of aviation 
tradesmen a few individuals for upgrading 
into positions of Planner and Estimator and 
Shop Planner. In the past, such selection has 
been made in terms of past experience in one 
of the aviation trades and successful and pro- 
gressive experience at the Journeyman level. 

Recently, personnel responsible for such 
selection have become interested in the use of 
psychological tests as an aid in such selection. 
Since no professionally trained psychologist is 
available on their staff, the writer has served 
in a consulting capacity on several occasions. 

A major difficulty has been the non-availabil- 
ity of adequately standardized trade tests with 
normative data applicable to the aviation trades. 
Lack of personnel and time has precluded the 
development of “tailor made” tests with the 
result that commercially available tests have 
been used and norms established for the local 
population. 

A recent administration of two tests—the 
Purdue Industrial Mathematics Test and the 


Adaptability Test by Tiffin and Lawshe— 
provided data on a population of 152 Aviation 
Tradesmen with a sufficiently large representa- 
tion from two trades to make feasible the pub- 
lication of such norms. 

Table 1 summarizes the background data 
on this population. It is to be noted that 
48 per cent of the total population have served 
an apprenticeship in a trade, and that 71 per 
cent have completed Trade School training. 
Fifty-seven per cent of the population has 
completed at least a high school education. 
This is a larger percentage than might have 
been expected. On the other hand, the group 
is a relatively young group in terms of age and
in number of years on the job. The range for 
years on the job was from two months to 2 
years. It must be concluded that personnel 
included in this group are well trained and 
experienced in their particular aviation trade. 

Table 2 presents the mean scores and stand- 
ard deviations for those trades where N was 
large enough to make this meaningful. ASTP
trainees do somewhat better on the Industrial 
Mathematics test and the Vocational Trade 
School students do less well than does the 


Table 1 


Descriptive Data for Population Tested 


Trade    N    Served Apprenticeship    No. Who Had Trade Sch. Trng.    At Least HS Educa.    No. Yrs. on Job    [heading illegible]
Aviation Mech. Gen. 51 9 if 
Metalsmith 42 21 32 26 6 F 
Machinist 17 15 10 5 7 2 
Electrician 16 13 11 33 
Aircraft Engine Mech. 12 7 11 i s S 
Instrument Mechanic 6 1 4 ; » 
Radio Mechanic 3 1 : 9 
Electroplater 3 i : H s 
Unclassified 2 : : 5 j 

0 2 
Total 152 68 
108 86 p si 




Table 2

Mean Grade and Standard Deviation by Trade

                                  Adaptability Test     Industrial Math Test
Trade                     N       Mean      S.D.        Mean      S.D.
Aviation Mech. Gen.       51      16.8      5.9         17.4      5.4
Metalsmith                42      15.9      4.6         17.0      5.5
Machinist                 17      16.5      7.2         15.6      6.6
Electrician               16      15.3      5.4         15.3      5.7
Aircraft Eng. Mech.       12      14.2      3.6         13.7      3.9

Total                    152      16.1      5.6         16.4      5.6


population reported. This is as one would predict in terms of the
higher educational level of the ASTP students as compared with the
Aviation Tradesmen and the latter’s greater “on the job” experience
as compared with Vocational Trade School students. On the
Adaptability Test the mean scores for the Aviation trades are
comparable to those published by the authors for Naval Electrical
Trainees, Employees of a Piston Ring Manufacturing Company and Total
Group. Aviation Tradesmen surpass Female Applicants and fail to do
as well as Employed Clerical Workers and Purdue Seniors.

Table 3 presents the norms for the Adapta- 
bility Test. Data to the left of the vertical 
line are reproduced from Examiners Manual 
for the Adaptability Test by Joseph Tiffin and 
C. H. Lawshe and published by Science Re- 
search Associates, 228 South Wabash Avenue, 


Table 3

Norms on the Adaptability Test

Columns: Percentile Score, Std. Score, six norm groups from the
manual (column headings illegible), Metalsmith, Aviation Mech. Gen.,
Total Aviation Tradesmen

100    3.00   32 35 33 34 30 35 33
 99    2.32   " 29 32 35 30 30 27 31 25
 98    2.05   2 27 30 34 29 28 26 29 28
 95    1.64   20 25 28 32 26 26 25 27 25
 90    1.28   18 24 26 31 23 23 2 24 3
 80     .84   15 2 24 29 20 20 20 2 21
 70     .52   13 20 22 28 18 18 18 20 19
 60     .25   12 19 21 27 16 16 17 18 is
 50     .00   11 17 20 26 14 15 16 7 15
 40    -.25   9 16 18 25 1 12 15 15 i5
 30    -.52   8 15 17 24 9 10 13 14 13
 20    -.84   6 13 15 23 j 8 12 12 i
 10   -1.28   4 11 13 21 4 5 10 9 9
  5   -1.64   2 9 11 20 3 4 8 5 7
  2   -2.05   7 10 18 1 1 6 4 5
  1   -2.32   6 8 17 5 3 3
Total N =     612 640 86 43 704 2175 42 51 152


Table 4

Norms on Purdue Industrial Mathematics Test (Form A or B)

                                 Vocational   |             Aviation    Total
Percentile   Std.     ASTP       Trade Sch.   |   Metal-    Mech.       Aviation
Score        Score    Trainees   Students     |   smith     Gen.        Tradesmen
100           3.00    32         28           |   33        35          33
 99           2.32    31         27           |   30        31          30
 98           2.05    30         26           |   28        29          28
 95           1.64    28         23           |   26        27          26
 90           1.28    27         20           |   24        24          24
 80            .84    25         18           |   22        22          21
 70            .52    23         16           |   20        20          19
 60            .25    21         14           |   18        19          18
 50            .00    20         13           |   17        17          16
 40           -.25    18         12           |   16        16          15
 30           -.52    17         10           |   14        14          13
 20           -.84    16          9           |   12        13          12
 10          -1.28    14          7           |   10        10           9
  5          -1.64    13          6           |    8         8           7
  2          -2.05    11          5           |    6         6           5
  1          -2.32    10          4           |    4         4           3
Total N =             125        188          |   42        51          152


Chicago, Illinois. To the right of the vertical 
line are comparable norms for Metalsmith, 
Aviation Mechanic General and the total 
Aviation Tradesmen tested at Pensacola. 

Table 4 presents the norms for the Purdue 
Industrial Mathematics Test (Form A or B). 
Data to the left of the vertical line are repro- 
duced from Preliminary Manual—The Purdue 
Industrial Mathematics Test by C. H. Lawshe, 
Jr. and Dennis H. Price—and distributed by 
the Division of Applied Psychology, Purdue 
University, Lafayette, Indiana. To the right 
of the vertical line are the additional norms 
developed at Pensacola. 
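A norms table of this kind is used by locating an applicant's raw score among the tabled values. The sketch below shows one way to do that lookup with the Total Aviation Tradesmen column of Table 4. The function name and the rule of reporting the highest tabled percentile whose raw score is not exceeded are assumptions made for illustration; the article does not prescribe an interpolation rule.

```python
from bisect import bisect_right

# Total Aviation Tradesmen column of Table 4, arranged in ascending order.
PERCENTILES = [1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 98, 99, 100]
RAW_SCORES = [3, 5, 7, 9, 12, 13, 15, 16, 18, 19, 21, 24, 26, 28, 30, 33]


def percentile_for(raw_score: int):
    """Highest tabled percentile whose raw score does not exceed raw_score.

    Returns None when the score falls below the lowest tabled value.
    """
    position = bisect_right(RAW_SCORES, raw_score)
    return PERCENTILES[position - 1] if position else None


for score in (2, 16, 20, 35):
    print(score, percentile_for(score))
```

A raw score of 16, for example, falls at the 50th percentile of the Pensacola group, and a raw score of 20 at the 70th.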


The correlation between the Adaptability 
Test and the Industrial Mathematics Test for 
the total population (N = 152) proved to be T 
This seems very high for the two tests, where 
one is supposedly measuring arithmetic ability 
while the other attempts to measure general
aptitude. An inspection of the items in the 
two tests, however, shows considerable overlap 
which would account for the high relationship 
found. 

The reliability of the two tests calculated by
the Kuder-Richardson short formula was found
to be .74 for the Adaptability Test, and 16 for 
the Industrial Mathematics Test. 
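The "short formula" is not written out in the note. The sketch below assumes it is the Kuder-Richardson formula that needs only the number of items, the mean, and the standard deviation (commonly numbered 21). The item count of 35 for the Adaptability Test is also an assumption, inferred from the highest raw score in the norms table rather than stated in the text.

```python
def kr21(n_items: int, mean: float, sd: float) -> float:
    """Kuder-Richardson formula 21: reliability from item count, mean, and S.D."""
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (
        1.0 - mean * (n_items - mean) / (n_items * variance)
    )


# Adaptability Test, total group in Table 2: mean 16.1, S.D. 5.6.
# n_items=35 is an assumed test length, not a figure given in the note.
print(round(kr21(35, 16.1, 5.6), 2))
```

Under these assumptions the formula returns a value that agrees with the .74 reported for the Adaptability Test.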


Received December 5, 1949.




Measurement of a Complex Psychomotor Performance
by Means of a Printed Test


Lloyd S. Nesberg and Karl U. Smith
University of Wisconsin


A practical aspect of measurement of human capacities and traits is
the simplification of present methods in appraisal of psychomotor
performance. One possible approach to simplified design of the
psychomotor test is the development of printed examinations which
will scale sensory-motor capacity in the same general manner as the
apparatus test. In the present study, developmental research on a
printed test has been carried out in order to measure the
performance involved in a judgmental reaction time test, the Vector
Complex Reactometer. This test has been developed for determination
of some aspects of pilot aptitude.


Methods


1. The Vector Complex Reactometer. In this test (Figure 1) the
subject’s reaction time is measured in turning a series of switches
relative to the changing position and direction of three lights, the
pattern of which is altered during a predefined sequence. The
subject, in turning a given switch on the response board, selects a
correct group of switches in terms of the relative direction of a
red light with respect to a green light and, thereafter, a
particular correct switch within the group in terms of the position
of a white light. The three lights are presented on the stimulus
panel automatically, and the subject’s response immediately prepares
the apparatus for the presentation of the next stimulus pattern.

The test under consideration here is designed to present forty
different light patterns in a sequence, which may be repeated after
the sequence is completed. It may be administered with a definite
time limit, in which case the number of reactions in a given test
period is scored, or the score may be defined as the time required
to perform a given number of reactions. In the research conducted,
the test was scored in terms of the total number of correct switches
turned in a four-minute test period.

2. The Motor Decision Test. In designing a 
printed form of this test (Figure 2) the same 
general principles of stimulus presentation and 
response involved in the Reactometer have 
been incorporated into each test item. Refer- 
ence to the sample item in Figure 2 will show 
how the test was designed. Instead of the 
four groups of five white lights in the stimulus 
panel of the apparatus test, the printed item 
presents four groups of circles each arranged 
as in the performance test. The critical 
stimulus among these groups of white circles 
is indicated by the black filled circle, as shown 
in the upper right hand group in the sample 
item. In the printed item, triangles and 
squares are substituted for the red and green 
lights of the apparatus test, and a black triangle 
and a black square constitute the critical 
stimuli for guiding the response. 

In the printed test, the subject responds by 
checking one of twenty-five inverted T's.
These inverted T's represent the switches in 
the performance test, and are grouped in banks 
of five as in the performance test. The correct 
bank of switches is indicated by the relative 
position of the black triangle with respect to 
the black circle. The particular correct switch 
within this bank is indicated by the position of 
the black circle. 

The Motor Decision Test reproduces all 
forty stimulus configurations of the apparatus 
test. Some of these forty items are repeated 
in the test to give a total of 108 items. The
completed form of the test consists of 12 pages 
of test items, with 9 items on each page. The 
score on this is the number of correct reactions 
minus the number of items marked incorrectly.
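The scoring rule, correct reactions minus items marked incorrectly, can be sketched as follows. The representation of a response as a switch number, with None for an item not reached, is a hypothetical encoding chosen for the example; the article describes only the arithmetic.

```python
from typing import Optional, Sequence


def motor_decision_score(
    responses: Sequence[Optional[int]], key: Sequence[int]
) -> int:
    """Number of items marked correctly minus number marked incorrectly.

    Items left unmarked (None) count neither for nor against the subject.
    """
    correct = sum(1 for r, k in zip(responses, key) if r is not None and r == k)
    incorrect = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    return correct - incorrect


# Hypothetical five-item excerpt: two right, one wrong, two not reached.
key = [3, 17, 8, 21, 5]
responses = [3, 17, 9, None, None]
print(motor_decision_score(responses, key))
```

Because unmarked items are ignored, the score under a time limit depends mainly on how many items the subject reaches, with errors acting as a penalty.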


Experimental Results 


The treatment of experimental results will be discussed in the
following order: (1) the standardization of the Motor Decision Test;
(2) correlations and interrelations of the performance and printed
tests; (3) the extent and significance of transfer; and (4) sex
differences in performance.


Fig. 1. The Vector Complex Reactometer. This device, designed by
Dr. Jack Buel, School of Aviation Medicine, San Antonio, Texas, is
manufactured by the Vector Manufacturing Co., Houston, Texas. The
instrument was obtained for research through the cooperation of
Dr. Buel.


Fig. 2. Design of a sample item in the Motor Decision Test.


1. The Standardization of the Motor Decision Test. Two preliminary
forms of the Motor Decision Test were designed and evaluated before
a final copy of the test was drawn up. On the basis of data obtained
with these two forms of the test as well as with the four-minute
form, the following facts were established. The four-minute test
interval established for this test length allowed none of the
subjects to solve all of the stimulus patterns. Memory and fatigue
effects were negligible. Analysis of the location of errors does not
reveal any differences in item difficulty. Errors in

Table 1


A Quantitative Summary of Performance on the Printed and Apparatus Tests


                                             Sex           Range     Mean    S.D.
Group I Performance Test (50 subjects)       25 Males      27-99     75.3    19.39
                                             25 Females    37-99     74.3    15.47
Group II Performance Test (50 subjects)      25 Males      58-135    85.8    13.51
                                             25 Females    58-105    82.3    13.55
Group I Paper-Pencil Test (50 subjects)      25 Males      28-98     61.6    14.46
                                             25 Females    27-83     54.2    14.57
Group II Paper-Pencil Test (50 subjects)     25 Males      37-100    65.7    17.27
                                             25 Females    44-91     66.2    12.72


response to specific test items are not persistent.
Total errors are extremely small, hence the number
of items marked represents the major factor in
the scoring.

Distributions of scores on both performance and printed tests are
approximately normal. A detailed quantitative description of data is
presented in Table 1. This table gives the means, range and standard
deviations in two groups of subjects for both sexes and all test
sequences. Group I was given the performance test first and, 48
hours later, the printed test. Group II was given the printed test
first. The subjects were college students. There are no great
differences in the range of scores on the printed and performance
tests. The means are consistently lower on the paper and pencil form
regardless of its position in the testing order. Males as a whole
are more variable in test performance than females. There is no
consistent trend in variability with regard to test sequence.

Reliability. The test-retest reliability of the Vector Complex
Reactometer, computed from the scores of a separate group of 23
subjects, is +.86. This is the reliability found when a one-day
period separated the two administrations. The test-retest
reliability of the Motor Decision Test, based on the performance of
53 subjects and a temporal separation of 7 days, is +.83. Data
presented below will show that the correlation between the apparatus
and printed tests approaches the reliability of each, as just
described.

2. Correlation between the Performance and Printed Tests.
Correlations between the two tests were calculated for males and
females separately and for the two conditions of test sequence used.
These data are given in Table 2. All values given are significant at
the 1 per cent level of confidence. When the correlation values
given are transformed into Fisher's Z,¹ and the standard error of
the Z's computed, the differences are not statistically significant.
It is therefore reasonable to suggest that for the conditions given,
the Motor Decision Test displays a high level of duplication of
measurement of the factors involved in performance on the
Reactometer.
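The comparison of correlations through Fisher's Z can be sketched for the most widely separated pair in Table 2, +0.84 and +0.64. The subgroup size of 25 is taken from Table 1. The standard error formula used, the square root of 1/(n1 - 3) + 1/(n2 - 3), is the conventional one and is assumed here, since the article does not print its computation.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's Z transform; equal to 1.1513 * log10((1 + r) / (1 - r))."""
    return math.atanh(r)


def z_difference_ratio(r1: float, n1: int, r2: float, n2: int) -> float:
    """Difference between two transformed correlations over its standard error."""
    se_diff = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se_diff


# Females vs. males, performance test second (Table 2), 25 subjects each.
ratio = z_difference_ratio(0.84, 25, 0.64, 25)
print(f"ratio = {ratio:.2f}")
```

A ratio below 1.96 falls short of the 5% level, which is consistent with the statement that the differences among the correlations are not statistically significant.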

3. Transference of Response between the Two 
Tests. The design of the experiment permitted 
determination of the degree to which prior per- 
formance on the printed test affected scores 
on the performance test and vice versa. This 


Table 2 


Correlations between the Performance and 
Printed Tests * 


Groups                                   r
Composite for females                  +0.70
Composite for males                    +0.69
Females: performance test first        +0.82
Males: performance test first          +0.79
Females: performance test second       +0.84
Males: performance test second         +0.64
Composite: performance test first      +0.79
Composite: performance test second     +0.77


* All values significant at the 1% level of confidence. 


¹ $Z = 1.1513 \log_{10} \frac{1 + r}{1 - r}$




transference may be attributed to general situa- 
tional learning occurring in one test situation 
which is carried over to the second test, or to 
specific skill acquired in one test situation which 
is applicable to the second test. The amount 
of transference of response or generalization is 
indicated by the change in performance on one 
test attributable specifically to the fact of prior 
performance on the other test. When all 
subjects are considered together, a statistically 
significant increment in performance on the 
printed test is found when this test is taken 
after the performance test. The level of con- 
fidence of the difference in scores on the per- 
formance test when taken after the printed 
test and when taken without such previous 
testing experience also exceeds the one per cent 
level. Analysis of variance of the transfer 
data confirms these general statements and 
indicates, furthermore, that there is no signifi- 
cant effect of test sequence on transfer. This 
analysis discloses, however, that there are 
certain inherent differences in the measured 
characteristics of the performance and printed 
tests. 

4. Sex Differences in Performance on the Two 
Tests. Values of t for differences in perfor- 
mance for males and females were computed 
for all data on each test and also for data 
separately obtained on each test with respect 
to test sequence. None of the values was 
significant, which confirms the hypothesis that 
the effect of the sex characteristic on these two 
tests is zero. 


Summary 


A printed test, designated the Motor De- 
cision Test, has been designed to duplicate a 
complex reaction time apparatus test which 
has been manufactured for study of aircraft- 
pilot aptitude. The apparatus test has been 
given the name of Vector Complex Re- 
actometer. 

Utilizing an item design made up of differ- 
ently shaded forms which simulate light posi- 




tions and switch arrangements in the apparatus 
test, it has been possible to duplicate in the 
printed test all combinations of stimuli pro- 
vided in the apparatus test. Comparisons of 
the two tests are made in terms of range of 
test scores, relative reliability, interrelation of 
test scores, and degree of generalization of per- 
formance from one test to the other. 

Results show that the two tests have approxi- 
mately equal test-retest consistency. The 
reliability of the apparatus test is +.86, that
for the printed test +.83. These values are
only slightly higher than those found for the
intercorrelation between the two tests. For
different groups of subjects, the intercorrela-
tion values typically vary from +.70 to +.80.

Means of scores on the printed test, ex-
pressed as total number of items answered
correctly, are usually somewhat lower than 
those on the apparatus test. In the latter 
case, scores are expressed as the total number 
of effective responses made in a standard time. 
The distribution and variability of scores for 
the two tests are not significantly different. 

An analysis has been made of the degree to
which performance on one test may affect
scores made on the other. A significant posi-
tive transfer effect is found for both tests in
terms of the effect of the prior administration
of one on the test scores found in the later
performance on the other. Some significant
minor differences in the measured character-
istics of each test are found through such over-
all analysis of the tests when administered
successively.

In general, it has been concluded that the
printed Motor Decision Test duplicates ex-
tensively results obtained with the complicated
apparatus test, and that such a printed test
could be used safely as a substitute for or in
conjunction with the apparatus test in screen-
ing procedures in which complex reaction time
may be of practical significance.


Received January 6, 1950. 


Visual Skill and Performance in a Meat Packing Plant * 


F. Nowell Jones 
University of California at Los Angeles


and 


Charlotte Jean Smith 
United States Spring and Bumper Company 


To the best of the writers’ knowledge, no 
work on the testing of packing house employees 
has ever been reported. It would seem that 
there would be considerable possibility for the 
use of selection devices in this industry, since 
many of the jobs require a high degree of skill, 
and involve some hazard of injury, especially 
cuts.


Preliminary Study 


The work on visual skill grew out of a pre- 
liminary study of a casings inspection depart- 
ment. Here the job required size grading and 
inspection by visual control, and results were 
negative when production was compared with 
the various Ortho-Rater measures, despite the 
fact that the criterion of production, output for 
two successive weeks, had a reliability of .96.¹
However, N was only 17, the total number of 
workers in the department, and so it seemed 
desirable to extend the study to other jobs 
in the plant. 


The Present Study 


The Jobs. The three jobs selected for study 
were wiener skinning, bacon slice and wrap,
and shipping cooler work. Wiener skinning is 
the procedure of removing the cellophane 


* This article is based on part of the material sub-
mitted by the junior author in partial fulfillment of the
requirements for the M.A. degree at the University of
Wisconsin. We wish to thank the Bausch and Lomb
Optical Company for making available the Ortho-
Rater used in this study, and to thank Mr. Harold
aeke, superintendent, and Dr. John M. McGinnes,
psychologist, of Oscar Mayer and Company, Madison,
Wisconsin, for respectively permitting us to use the
facilities of the plant, and assisting us in obtaining
subjects and criterion scores.

¹ The Wonderlic Personnel Test and the Minnesota
Rate of Manipulation Test were also administered to
this group. The correlations with the criterion were:
Wonderlic, −.31; Minnesota, placing, −.30; Minnesota,
turning, −.39. These tests are apparently worthy of
further study.


“casings” from “skinless” frankfurters or 
wieners after smoking. This involves the use 
of a moderately sharp knife to cut the links 
apart, and to start the tearing of the cello- 
phane. The skin is then removed by a spiral 
tearing action. Study of this job had been 
completed by the time study department, and 
there was considerable interest in improving 
production. The cellophane was quite difficult 
to see, being transparent, and it is possible that 
the job is performed best by “feel.” 

In the bacon slice and wrap operation we 
were concerned with the persons who lifted 
sliced bacon from a conveyer belt, weighed it 
into pound lots, and wrapped each lot in paper. 
Each operation involved visual control. 

The shipping cooler jobs were of the laboring 
type, in that the main work was moving stock 
to fill orders. It would not be expected that 
close visual control would be necessary here. 

Criteria. In the wiener skinning depart- 
ment it was possible to obtain two independent 
ratings of each employee, one by the foreman 
and one by the time study man who had been 
regularly assigned to this department. They 
rated each worker on a percentage scale, with 
100 indicated as the norm. Each rater made 
two ratings, a month apart. Unfortunately, 
the second rating by the foreman disappeared 
in the mails, and was therefore irretrievably 
lost. The two ratings by the time study man 
correlated .94, and the average of his two
ratings correlated .81 with the foreman’s rating. 

In the bacon slice and wrap department the 
criterion consisted of the foreman’s rating on a 
5-point scale. As a matter of fact, the 1 posi-
tion was not used, on the ground that all such
workers had been transferred out. A second 
rating was made by the same foreman some 
weeks later, and the two ratings correlated .84. 
Considering the rather short scale used, this 
indicates a reasonably high criterion reliability. 




Table 1


Correlations Between Production and Criterion


                                      Correlations*

                              Wiener      Shipping Cooler   Bacon Slicers
Test                          Skinners    Employees         and Wrappers

Ortho-Rater:
  Far Vertical Phoria           .32          −.13              .09
  Far Lateral Phoria            .31          −.12             −.29
  Near Vertical Phoria          .02          −.06              .08
  Near Lateral Phoria          −.03           .06             −.03
  Far Acuity, Both Eyes         .37           .09             −.09
  Far Acuity, Worse Eye        −.19           .24              .00
  Near Acuity, Both Eyes        .05           .19              .07
  Near Acuity, Worse Eye       −.18           .23          [illegible]
  Depth Perception              .20           .03              .03
  Color Perception             −.10          −.02              .04

* None of these correlations is significant; in no case does r/σ_r approach 2.58, and t's show insignificant r's for
wiener skinners.


The criterion in the shipping cooler was not
at all satisfactory. The foreman's rating was
used as in the case of bacon slice and wrap, but
only the 3 middle positions on the scale were
used, and the correlation between the first
rating and a rerating was only .45.

Subjects. In bacon slice and wrap and
wiener skinning the employees were almost
entirely female. A further restriction in
bacon slice and wrap was the hiring of only
young workers. In the shipping cooler, the
employees were predominantly male. The
populations were as follows: bacon slice and
wrap, 47; wiener skinning, 26; shipping
cooler, 66.

Testing. Testing with the Ortho-Rater was 
carried out in each department. Taking the 
test was voluntary, but as a matter of fact there 
were no "holdouts." If glasses were custom- 
arily worn on the job, they were worn for the 
test. Testing was on company time. 


Results 


Table 1 shows the correlation between the 
criterion in each department and the various 
scales on the Ortho-Rater. None is signifi-


Table 2 


A Comparison of the Means and Standard Deviations for the Packing House Group and the OSRD Group 


Packing House 


CL RE OSRD 
N = 139 [Soa 
Mean e wa = 234 z 
Far Vertical Phoria 5.43 z 
Far Lateral Phoria 7.59 ms 344 a> 
Far Acuity, Both Eyes 9.45 249 7.19 1.97 
Far Acuity, Worse Eye 8.15 2. 46 10.96 1.37 
Depth Perception 2.74 Ds 9.58 1.68 
Color Perception 4.30 1 e 4.39 3.00 
Near Acuity, Both Eyes 9.50 167 4.81 1.01 
Near Acuity, Worse Eye 8.50 2 11.91 1.10 
Near Vertical Phoria FEM E 10.50 1.64 
Near Lateral Phoria 8.57 rec 4.30 n. 
B s 7.97 2. 




cant at the 5% level, when tested by the
standard error. The correlations for wiener
skinning were also subjected to a t test, and
were found not to be reliable. Since a correla-
tion between a phoria test and a criterion would
be hard to interpret if the phoria scores were
taken straight through from high to low, the
r's reported here are based on scores as devia-
tions from the norm. Correlations calculated
straight through show no significant relation-
ship either. In addition, each scatter diagram
was carefully inspected, and in some cases,
where it appeared useful, average performance 
for each score on a given Ortho-Rater scale was 
plotted, to determine whether or not the cor- 
relation had failed to reveal a cut-off point, 
or other relationship. In no case was this true. 
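
The two significance checks mentioned above can be sketched as follows. The article does not state which standard-error formula was used, so the null-hypothesis form 1/sqrt(N − 1) is an assumption here, and the function names are hypothetical. The example uses the largest wiener-skinner correlation in Table 1 (.37, with the decimal point restored) and the department size of 26.

```python
import math


def r_over_sigma(r: float, n: int) -> float:
    """Ratio of r to its standard error, taking sigma_r = 1 / sqrt(n - 1).

    This null-hypothesis standard error is an ASSUMPTION; the source
    does not say which formula the authors applied.
    """
    return r * math.sqrt(n - 1)


def t_for_r(r: float, n: int) -> float:
    """Student's t for a product-moment correlation, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r, n = 0.37, 26  # Far Acuity, Both Eyes; wiener skinners
ratio = r_over_sigma(r, n)
t_value = t_for_r(r, n)
print(f"r / sigma_r = {ratio:.2f} (compare with 2.58)")
# The 5 per cent point of t for 24 degrees of freedom is about 2.06.
print(f"t = {t_value:.2f} with {n - 2} degrees of freedom")
```

Under these assumptions even the largest coefficient for this department falls short of both criteria, which agrees with the statement in the text.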

It is apparent that, within the limits of
criterion validity and reliability, application of
the Ortho-Rater would, on the face of the
matter, add nothing to the selection of workers
for the jobs under consideration. It is possible
that some restriction of range of Ortho-Rater
scores might have lowered the relationships
found, and so we have compared our means
and standard deviations with those reported
in the OSRD study of reliability.² The means
are quite comparable, and our standard devia- 
tions are, except for depth perception, larger. 
The means and standard deviations are given 
in Table 2. It is difficult to escape the con- 
clusion that the Ortho-Rater would not be 
useful in this company, at least as far as the 
jobs under consideration are concerned. 


Summary 


1. The Ortho-Rater was used to measure 
visual efficiency of employees in three packing 
house departments: wiener skinning, bacon 
slice and wrap, and the shipping cooler. In 
all, 139 employees were tested. 

2. No significant correlation between Ortho- 
Rater scores and efficiency, as determined by 
foremen’s ratings (and in the wiener skinning 
department, by foreman’s and time study 
man’s ratings), was found. 


Received November 7, 1949. 


² Adams, D. K., Beir, D. C., Imus, H. A. A test-
retest reliability study of the Bausch and Lomb Ortho-Rater 
with Naval personnel. Office of Scientific Research and 
Development, No. 3969, August 1, 1944. 


The Myth of Chronological Age * 


Austin S. Edwards 
University of Georgia 


In recent years there has been increasing 
study of the problems of age and aging (3, 4, 
5,7). Increasing numbers of the aged appear 
in our population and old age counseling has 
become a definite kind of psychological work. 
Many articles have appeared, but except for 
occasional reference, no accurate information 
is given concerning senescents who are reason-
ably healthy as compared with those who are
seriously deteriorated or disabled (2). Gen- 
erally no distinction is made. See, for ex- 
ample, the excellent article on chronological 
age and quality of literary output (5). 

This paper is concerned with the question 
of the differences between senescence and 
senility. It suggests that here, as elsewhere, 
chronological age is an inaccurate indication 
of ability and competence. To what extent do 
the senescents differ from the seniles? Are 
many senescents still quite as capable as so- 
called younger individuals? In what respects 
are healthy and relatively uninjured senescents 
as capable or more so than younger people? 
All of these questions have had insufficient 
answers. 

It is the purpose of this paper to consider 
one very important aspect of ability, namely, 
body, hand, and arm steadiness and to give 
some accurate information as to the difference 
in these respects between senescents who are 
not known to have any serious physical handi- 
caps, disease history, etc., and seniles, so 
diagnosed at a state mental hospital, and to 
compare both with the average for younger 
people. 

Body Sway. A limited number of Ss were 
measured for body sway on two different 
occasions with no special reference to disease 
history. In both cases the senescent individ- 
uals were practically as steady as the younger 
Ss, whose ages ranged from 10 to 29. In the 
first study, which gave age differences, the Ss 
aged 50-69, twenty-four in number, showed 


* Thanks are due Miss Anne Gilbert for assistance in [remainder illegible].


no greater body sway than did the younger Ss,
the mean and the median giving contradictory
results. In the later study with 23 Ss, aged
50-70, the difference, if any, was in favor of 
the older Ss. This was true with eyes open; 
the results with eyes closed showed little dif- 
ference, although some of the older Ss showed 
somewhat more sway than the younger. 

With insufficient numbers of cases, it does 
not appear that Ss from age 50 to 70, who have 
not suffered from disease or accident to any 
great extent, have necessarily more body sway 
than do younger Ss. In fact, taking individual
cases, many of the older Ss have considerably 
less body sway than do many of the younger. 

Hand and Arm Steadiness. Finger Move-
ments. A study has been made in connection
with finger movements. The writer’s finger
tromometer was used with standard procedure
(1). The tromometer measures finger move-
ments in three dimensions of space. Time of
measurement is 30 seconds. It is believed that
finger tremor (gross finger movements) is a
decidedly sensitive indicator. The mean finger
tremor of 1000 Ss (aged 16-35) is 35.3 mm.,
S.D. 20.25. The 65 senescents examined,
aged 60-85 (average chronological age, 70.4),
who had no serious amounts of disease history
and who may be considered as reasonably
healthy individuals, had a mean finger tremor
of 42 mm., S.D. 24.5. The difference between
the means of these and of the 1000 younger
subjects showed an increase of finger move-
ments for the older group of only 19 per cent,
with a critical ratio of 2.21 (significant at the 5
per cent level). Some of the oldest Ss were
actually the steadiest.

In contrast to this are the results of the
measurements of the senile group at a state
mental hospital. These 89 cases were diag-
nosed in the case histories by the hospital staff
as senile. Their ages were 54-90 (average
chronological age, 67.7). Although the aver-
age chronological age of the seniles was about
three years less than the average of the senes-
cents, the average finger tremor was 133.3 mm.,
S.D. 66.8, which is more than three times as
great as that of the senescents. The difference
between the means of the seniles and the 1000
younger Ss gave a critical ratio of 12.8, and that
between the senescents and the seniles, a
critical ratio of 12.0.
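
The critical ratios quoted above can be approximated from the reported means, standard deviations, and group sizes. The sketch below assumes the usual formula for the difference between two independent means; the article does not state its exact formula, and its means are rounded, so the results need not reproduce the printed ratios exactly. The function name and the tuple layout are introduced here for illustration only.

```python
import math


def critical_ratio(group_a, group_b):
    """Difference between two independent means over its standard error.

    Each group is a (mean, standard deviation, number of cases) tuple.
    The formula is an ASSUMPTION; the source does not give the one used.
    """
    mean_a, sd_a, n_a = group_a
    mean_b, sd_b, n_b = group_b
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Finger tremor in mm., as reported in the text.
younger = (35.3, 20.25, 1000)
senescent = (42.0, 24.5, 65)
senile = (133.3, 66.8, 89)

per_cent_increase = 100 * (senescent[0] - younger[0]) / younger[0]
print(f"Senescent increase over younger Ss: {per_cent_increase:.0f} per cent")
print(f"Senescent vs. younger: {critical_ratio(senescent, younger):.2f}")
print(f"Senile vs. younger:    {critical_ratio(senile, younger):.2f}")
print(f"Senile vs. senescent:  {critical_ratio(senile, senescent):.2f}")
```

The pattern is the one described in the text: a ratio of about 2 for the senescents against the younger Ss, and ratios well above 10 for the seniles against either group.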

Thus we find a small but statistically reliable 
difference between the senescents and the 
younger Ss, although the senescents cannot be 
said to have abnormal finger tremor or an 
amount that might be expected to have much, 
if any, practical significance. On the other 
hand, the seniles form a distinctly different 
group as compared with either of the others, 
and have an amount of finger and arm move- 
ment (involuntary, uncontrolled movement) 
that would be expected to have serious con- 
sequences so far as skilled work involving fine 
muscular control and steadiness is concerned. 

Men vs. Women. The differences between 
men and women are not remarkable except 
perhaps on one point. The senile men had 
increased finger tremor in comparison with 
women in almost the same relative amount as 
occurs with our so-called younger normal 
groups. On the other hand, the senescent 
women showed relatively more increase than
did the senescent men. The average for
younger men is 39.83, S.D. 21.17, and for the
senescent men the average is 42.67, S.D. 26.2.
The norm for younger women is 30.33, S.D.
18.02, and the average for the senescent women
was disproportionately higher, namely, 41.27,
S.D. 26.4. If this difference between senescent
men and women is truly indicative of other
possible sex differences for the aged, it may be
of no little scientific and practical significance.


Discussion 


The competence of so-called aged people, 
whatever that may mean, is to be understood 
as an individual matter. There are many 
individuals above 45 who are not competent 
to do good skilled work. On the other hand, 
there are skilled workers of much greater age 
who are entirely competent and perhaps better
both in quality and quantity of production
than many younger workers. The personnel
problem is not one of merely grouping all indi-
viduals above some certain age and calling
them incompetent. Our study gives emphasis 
to the need for discovering what old people are 
entirely competent, and what aged people have 
been made incompetent not because of chrono- 
logical age but because of sickness, injury, or 
deterioration. It is reasonable to expect that 
many older people need more frequent rest 
periods than do younger workers. It is even 
more important to recognize the fact that 
many older workers are of the highest com- 
petence and are often found to be superior in 
judgment, foresight, carefulness, freedom from 
accidents, and trustworthiness. For our study 
old age has meant 60-85, with practically as 
good hand steadiness as that of students aged 
16-35. What does old age mean as it is 
commonly used in business and industry? Is 
it much more than a superstition held by those 
who still continue in the fossilized thinking of 
by-gone ages? 


Summary 


1. So far as we have data on body sway, it 
appears that many people whose ages reach at 
least 70 are no less steady standing in the erect 
position than are many younger people. A 
considerable number of senescents are con- 
siderably more steady than the younger Ss. 

2. In our experiments upon finger tremor it is 
clear that senescents do not differ greatly from 
the average of 1000 college students. It is 
clearly evident that the average finger tremor 
for senile patients is more than three times as 
great as that of presumably healthy senescents 
whose average age was a little greater than that 
of the seniles (70.4 as compared with 67.7). 

3. Senescent women were found to have dis- 
proportionately greater increase of finger 
tremor than senescent men. Although this 
was not great, it suggests a problem worthy of 
further research to discover whether greater 
deterioration and inability appear in aging 
women than in aging men. 

4. It is suggested that competence for various 
kinds of work can only very inadequately be 
judged in terms of any such rough and in- 
accurate indication as that given by chrono- 
logical age; and that abilities of men and 
women should be decided by means of such 


accurate methods as are here suggested in one
area of human behavior.

Received November 10, 1949.


References


1. Edwards, A. S. The finger tromometer. Amer. J.
Psychol., 1946, 59, 273-283.
2. Geriatrics. Vols. 1-4, 1946-49.
3. Kaplan, O. J. (Editor). Mental disorders in later
life. Stanford University: Stanford University
Press, 1945.
4. Kuhlen, R. G. Age differences in personality during
adult years. Psychol. Bull., 1945, 42, 333-358.
5. Lehman, H. C., and Heidler, J. B. Chronological
age vs. quality and literary output. Amer. J.
Psychol., 1949, 62, 75-89.
6. Pressey, S. L., Janney, J. E., and Kuhlen, R. G.
Life: A psychological survey. New York: Harper
& Bros., 1939.
7. Ruch, F. L. Adult learning. Psychol. Bull., 1933,
30, 387-414.


Reading Ease of Commonly Used Tests * 
Ralph H. Johnson 


Veterans Administration, Minneapolis 


and 
Guy L. Bond 


University of Minnesota


The authors of this paper are concerned with
the use of vocational tests requiring reading
skills in the expanding post war counseling
programs in personnel selection, school guid-
ance programs, and the vocational counseling
of veterans.

Numerous intelligence tests requiring read-
ing skills are being administered to general
population groups, veteran and non-veteran,
and similar tests continue to be administered
to students at all grade levels in survey testing
for individual guidance purposes. Recent de-
velopments in the area of vocational counseling
have also been characterized by the increased
use of the Kuder Occupational Preference
Record at the junior high school level and by
more frequent use of the Strong Vocational
Interest Blank at the senior high school level,
plus the extensive use of both tests in the
vocational counseling of World War II vet-
erans. It appears that some clarification is
needed concerning the reading levels of the
general population and student groups and
the extent to which vocational tests which
require reading skills match these reading
levels.

This article approaches the problem by indi-
cating the general types and prevalence of
reading limitations among the academic and
general population. Second, this article at-
tempts to arrive at a relative determination of
the readability level of tests commonly used
in counseling and group testing situations.
Third, this article suggests observations and
conclusions that may have implications of im-
portance concerning the use of tests requiring
reading abilities in vocational counseling
situations.

* The opinions expressed herein are the views of the
authors and are not to be construed as representing the
Veterans Administration.


Clients assigned to tests requiring the use of 
reading skills may be limited in their compre- 
hension of written materials because their in- 
tellectual capacity prevents them from com- 
prehending material written beyond the ninth 
and tenth grade levels; or, the clients may 
have reading disabilities which precluded their 
reading growth from developing at the same 
pace as their intellectual development. A 
client in the latter category is considered a 
reading case if there is a significant degree of 
difference between his mental age and his 
reading age. At the junior high school level, a 
student is considered to be a reading case if his 
reading age is two or more years below his 
mental age (20). Mental age in the above 
definition is considered to be the mental age 
derived from an individual test of mental 
ability such as the Stanford-Binet or Wechsler- 
Bellevue. 

Table 1


Reading Level of Adults with IQ's Below 100


                Reading
 IQ     MA      Grades
 75    11.3      5.4
 80    12.0      6.2
 85    12.9      7.0
 90    13.6      7.4
 95    14.4      8.6
100    15.0      9


In the general population, the prevalence of 
reading limitation ascribed to lack of intel- 
lectual capacity is reflected by Table 1. Table 
1 is derived from Gates' (13) table for trans- 
lating age scores into grade scores. Bond (5) 
states that normal reading grade at maturity, 
as indicated by Table 1, may be considerably 
increased by exposure to good teaching.

Other indications of the reading level of the
general population are the following: (a) Un- 
published results of Army and Navy surveys 
which indicate mean reading grade levels of 
between grades 8 and 10 for veterans of World 
War II; (b) Lorge and Blau (19) found an 
average reading grade level of 9.2 among 242 
tested adult WPA subjects; (c) Census data 
for 1940 give grade 8.4 as the average school 
grade completed by our population. This is of 
significance when considered with Witty's (24) 
conclusion that the average attainment in 
reading of elementary school graduates har- 
monizes closely with grade expectancy. 

An indication of the prevalence of reading 
disability cases is reported in a study by Gray 
(14), who states that one out of five junior high 
school students in Chicago had a reading dis-
ability or was considered a reading case. An-
other study by Monroe (20) indicates that four 
out of five reading cases are boys. 

Assuming that these studies generally repre- 
sent an approximate indication of the reading 
limitations of the general population, it is 
incumbent upon counselors to exercise care in
the selection and interpretation of tests which 
require varying degrees of reading skills. 
Since test results may vary depending upon the 
reading level of the client, estimating and, if 
possible, knowing the approximate reading 
level of the client and the readability level of 
the test seems necessary to insure reasonably 
accurate test selection and interpretation. 

Clues which may give a general indication 
of the client's reading level are the following: 
(a) Results of a test of mental ability, if avail- 
able; (b) Vocabulary level of the client as 
revealed in the interview; (c) Client's facility
at interpreting questionnaire material; (d) 
Content of client's correspondence; (e) Client’s 
educational level (use with caution, since an 
individual's reading level may vary as much 
as six grades from his educational level). 

Observation by the psychometrist or coun-
selor of the client's behavior in the test room
situation may prove very meaningful in de-
tecting clues as to the reading level and type
of reading disability characterizing the client.
These clues may consist of the following: (a)
Excessive articulation and lip reading; (b) Use
of crutches in reading (pointing with finger and
pencil); (c) Frequent regression in reading
passage; (d) Excessive fixations per line; (e)
Frequent consultation with psychometrist re-
garding comprehension of test directions and
items; (f) Speed of reading; (g) Discrepancies
between verbal and non-verbal tests and dis-
crepancies between subtests; (h) Complaints
of tired eyes.

A quick check of the difficulty of a passage
for an individual is the number of words he
misunderstands and misses orally out of 20.
Betts (4) states that if a student in the ele-
mentary or secondary grade level misunder-
stands over one word in 20, he may lose the
meaning of the passage. This type of check
may prove efficient in a testing situation. If
there is sufficient time, group or individual
reading survey tests would give the desired
information.

Assuming the counselor's selection of indi-
vidual tests in a test battery is made to con-
form with the client's reading level, one must
make the further assumption that the coun-
selors have information regarding the reading
level of various vocational tests.

The authors have felt that more information
was needed regarding the reading level of
various tests which are being given rather
routinely to academic and general population
groups, and have attempted to measure this
reading level by applying the Flesch formula to
the more commonly used tests in vocational
counseling (2, 3). In most instances the
formula was applied to the directions sepa-
rately.

Formulas for measuring the readability level
(comprehension) of grade school textbooks
have been in use for the past 25 years.
Klare (17) states that to date there are
34 formulas or methods available.

The Flesch formula was used in this limited
survey for the following reasons: It is an effi-
cient formula; its author (9, 11) claims it has
been used with success in sampling the reading
level of adult reading materials; it has been
applied experimentally to the readability level
of Public Opinion Questionnaires [citation illegible]; and
Klare (17) states that it correlates significantly
with other formulas. Flesch's revised formula
(11) is based on the number of syllables per
100 words and the average sentence length in
words. Scores resulting from this formula
vary from 100 to zero. A score of 100 corre-
sponds to the prediction that a child who has
completed the fourth grade will be able to
answer ¾ of test questions asked about a
passage that is being rated. A score of zero
indicates that the passage is practically
unreadable.
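
The two quantities on which the revised formula is based can be combined as in the sketch below. The constants are those of Flesch's published Reading Ease formula and are not stated in this article; the function names and the sample counts are hypothetical and serve only as illustration.

```python
def flesch_reading_ease(syllables_per_100_words: float,
                        average_sentence_length: float) -> float:
    """Reading Ease from syllables per 100 words and words per sentence.

    The constants come from Flesch's published formula; they are not
    given in the article itself.
    """
    return (206.835
            - 0.846 * syllables_per_100_words
            - 1.015 * average_sentence_length)


def reading_ease_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Convenience wrapper for raw counts taken from a sample passage."""
    syllables_per_100_words = 100.0 * syllables / words
    average_sentence_length = words / sentences
    return flesch_reading_ease(syllables_per_100_words, average_sentence_length)


# HYPOTHETICAL sample: 100 words in 5 sentences, 150 syllables in all.
score = reading_ease_from_counts(words=100, sentences=5, syllables=150)
print(f"Reading Ease = {score:.1f}")
```

Longer words and longer sentences both lower the score, which is why counting all five responses of a multiple choice item would tend to make a test appear more difficult. The raw formula can fall outside the range of zero to 100 for extreme passages; no clamping is applied in this sketch.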

For further information regarding develop- 
ment, standardization, reliability, and validity 
of this formula, the reader is referred to publica- 
tions by Flesch (9, 10, 11, 12). 

The application of the formula to some 
commonly used vocational tests gave the re- 
sults indicated by Tables 2 and 3. Table 2 
represents tests for which an overall selected 
sample was used in determining readability 
level. Due to the spiral nature of the tests 
listed in Table 3, three separate measures of 
reading ease were made. 

In determining the Reading Ease of multiple
choice test items, only the correct response was
counted. Had all five of the possible re-
sponses been included, sentence length and
word length would have been increased, re-
sulting in a lower (more difficult) Reading
Ease score. Therefore, the results listed in
Tables 2 and 3 tend to represent a conservative

Table 2


Over-all Readability Levels of Selected Tests
as Determined by Application of
the Flesch Formula


                                              Reading    Grade
Test                                           Ease      Level

Bennett Mechanical Comprehension                90        5.5
Minnesota Multiphasic Personality Inventory     88        6.0
Directions for Minnesota Clerical               87        6.0
Bell Adjustment Inventory                       80        [illegible]
Directions for Bell                             61        9.5
California Interest Inventory                   65        9.0*
Kuder Occupational Preference Record            60        9.5
Directions for Kuder                            70        8.0
College G.E.D. No. 2                            59        10.0*
Ohio State Psychological Test (Part 3)          37        [illegible]
Strong Vocational Interest Blank                35        [illegible]
Directions for Strong                           73        [illegible]
Allport-Vernon                                  35        15.0*
Directions for Allport-Vernon                   60        9.5


* Starred grade scores represent Flesch's corrected
grade placement for the area of extrapolation beyond
the 7th grade.


estimate of the reading difficulty of the test 
items sampled. 

The authors recognize that the application 
of the Flesch formula to test items in voca- 
tional counseling test batteries was not the 
purpose for which the formula was designed.
Therefore, the following definite limitations of
the formula must be considered in any inter- 
pretations resulting from its application: (a) 
Reading grade placement scores above grade 7 
represent estimated corrections for area of 
extrapolation beyond grade 7; (b) The for- 
mula does not appear to measure the effect of 
complexity caused by double negatives in a 
test such as the Minnesota Multiphasic Per- 
sonality Inventory; (c) The formula does not 
appear to measure the rather involved direc- 
tions in some parts of the Strong Interest Test; 
(d) The above results have not been verified 
by any extended sampling and, with the excep- 
tion of the interest test items, sampling is 
limited to test items that are complete sen- 
tences; (e) The formula is not designed to 
measure the readability level of test items 
that are not complete sentences. Therefore,
the readability level of the Strong, Kuder, and 
California Interest Tests, as indicated in Table 
2, can be considered valid only in the sense that 
they indicate relative difficulty of the several 
tests. 

It appears that a formula which measures 
word complexity and abstractness may be 
more appropriate in measuring the readability 
level of interest test items which contain single 
words and phrases. The authors did not at- 
tempt to apply a formula of this nature to the 
interest tests. However, the Lewerenz for- 
mula, which is based on vocabulary alone, was 
used by Stefflre (21) in computing the reading
difficulty of interest inventories and, with
the exception of the results on the Study of 
Values, there appears to be relative general 
agreement on the readability level of interest 
tests as measured by the Flesch and Lewerenz 
formulas. It is of interest to note that Aukerman
(1) questions the importance of specific
vocabulary ability in reading situations, since
his study indicated that good students were
significantly superior to poor students in
general reading ability, and the fact that good
students and poor students were not significantly
different in either general or specific
vocabulary ability indicates that knowledge of
words is less important than reading ability
as a whole.

Table 3

Grade and Reading Ease Level of Some Intelligence Tests as Determined by the
Application of the Flesch Formula

Column groups: Directions, First Items, Middle Items, Last Items
(each reported as R.E. followed by G.L.)

1. A.G.C.T. 93 5.5
   a. Block counting 91 5.5 91 5.5 91 5.5
   b. Arithmetic items -- 78 7.0 86 6.5 85 6.5
   c. Vocabulary items 86 6.5 59 10.0* 62 9.0
2. Henmon-Nelson, Form A, Grades 7-12 96 5.0
   a. Non-arithmetic items 86 6.5 56 11.0* 43 14.0
   b. Arithmetic items 96 5.0
3. Kuhlmann-Anderson (Grade 6) Oral
   a. Sample I 95 5.0 76 7.5
   b. Sample II 95 [illegible] [illegible] [illegible]
4. Otis Higher Exam, Form C 81 7.0 80 7.0 71 8.0 [illegible] 13.0
5. Terman-McNemar, Form D 86 6.5
   a. Information test 90 5.5 [illegible] [illegible] 58 10.0*
   b. Logical selection 64 9.0* [illegible] 15.0 10 [illegible]
   c. Analogies 87 6.0 80 7.0
   d. Best answers 62 9.0* 63 9.0*

* Starred grade scores represent Flesch's estimated corrected grade placement
for the area of extrapolation beyond the 7th grade. Reading grade placement
scores indicated by Tables 2 and 3 derived from Flesch's (11) conversion tables.

A limited check on the accuracy of the
Flesch formula was made by applying the
Lorge (18) formula to the Minnesota Multiphasic
Personality Inventory and the Dale-Chall
(8) formula to a sample of the information
section of the Terman-McNemar Intelligence
Test. The application of the Lorge
formula to the same sampling of the MMPI
resulted in a grade score of 6.1, which is in
agreement with results on the Flesch formula.
The Lorge formula is based on average sentence
length, number of prepositional phrases
per 100 words, and number of difficult words
not appearing in Dale’s list of 769 easy words.
The application of the Dale-Chall formula to
the Terman-McNemar Information Test gives
a corrected grade level score of 12, which is
somewhat higher than the grade score arrived
at for the same sample (last items) by
application of the Flesch formula. The Dale-Chall
formula is based on sentence length and
number of difficult words not appearing in
Dale’s list of 3000 familiar words.
Scrutiny of Tables 2 and 3 suggests tentative
observations which may have implications of
value to counselors if the results in the tables
can be accepted as approximating the reading
levels indicated. In some instances the
directions gave a reading grade score below the
readability level of test items. Interest tests,
particularly the Strong Interest Test and the
Study of Values, appear to be at a reading
level considerably higher than that possessed
by the mean of the high school or general
population. The relatively high reading
level of the Strong Interest Test, as reported
by the Flesch and Lewerenz formulas, appears to
be in accord with the educational level of
Strong’s Occupational Criterion Groups since
his groups are considerably above the mean
of the general population. Strong (22) lists
the average educational level of each occupational
criterion group, and they range from
10.4 to 19.0. The mean educational attainment
of 35 out of 39 of Strong's Occupational
Criterion Groups was 14.5, and the standard
deviation of the group was 2.4.

The Kuder Preference Inventory appears to
have a reading level above that of the average
junior high school student. Accordingly, it is
of interest that a study by Christensen (7)
indicated that 9th grade pupils of average
intelligence had erroneous ideas concerning the
meanings attached to 21 key words in the
Kuder Preference Record. After instruction
concerning the meaning of these terms, the
results indicated that such instruction probably
played a role in causing subjects to change
their preferences.

The Bell and the MMPI personality tests 
have reading comprehension levels which 
would appear to be understood by most adults
and junior high school groups. 

Some of the commonly used intelligence 
tests measured appear to be at a reading level 
above that of the group for which the tests were 
designed and also appear to be spiral power 
tests which may to some degree be measuring 
reading skills. Center and Persons (6) found 
that after one semester of remedial reading 
instruction, some pupils gained from 12 to 17
IQ points on the Terman Group Test of Mental
Ability. A class of 40 pupils made an average
gain of 4.3 IQ points under the above conditions.
The Otis and the Kuhlmann-Anderson
seem to more nearly match the reading levels of
the groups for which they were designed. 

The block counting and arithmetic items of
the AGCT appear to be at the reading level of
adults whose mental age is 13 and above.
These two subtests of the AGCT would be in
accord with the reading level of most junior
and senior high school students. The reading
level of the entire test would not be too difficult
for most senior high school students and above
average adults, since the most difficult section
has a reading level of about grade 10. However,
it must be remembered that over one half
of the adult population is below this level and
may be mismeasured.

Some non-arithmetic items of the Henmon-Nelson
test and the logical selection of Terman-McNemar
are at about the college reading
level. That is, about grade 14.


Summary 


Assuming that the above observations are 
generally correct, the findings of applying 
reading ease formulas to directions and items 
of commonly used tests may be summed up 
as follows: 


1. Counselors, psychometrists, teachers and 
personnel technicians may have to re-evaluate 
their past and present practices regarding test 
selection, administration and interpretation. 

2. Psychometrists and counselors cannot 
assume that if a client comprehends the direc- 
tions for a test he will necessarily comprehend 
the test items. 

3. There is need for interest inventories that 
junior high school students and the general 
population can easily comprehend. 

4. The reading level of the Bell and MMPI 
tests appears to be well adapted to most of the 
general population and most junior high school 
groups. However, the variability in reading 
ability in these groups would indicate that even 
these tests would not adequately measure the 
clients in the lower end of the reading distri- 
bution. 

5. The Army General Classification Test 
would appear to have possibilities for more 
extensive use at the junior and senior high 
school levels and for use with general popula- 
tion groups. It also has possibilities for use 
as a substitute for individual tests of mental 
ability. This conclusion is fortified by a recent 
study by Tamminen (23) which shows a cor- 
relation of .83 between the Wechsler-Bellevue 
and the AGCT. 

6. There is need for group tests of mental 
ability that are not affected by an individual’s 
abilities in reading. It appears that some of 
our commonly used intelligence tests tend to 
favor those individuals who have, because of 
their environment, attained a high degree of 
reading skill. If we are to encourage and 
assist in the development of an individual’s 
natural potentialities, we need reliable means 
of measuring his real innate intelligence, no
matter how inadequate his reading skills may
be. Our country’s brain power is likely its
greatest resource and certainly every effort
should be made to discover and develop it.


Received November 28, 1940. 


References

1. Aukerman, R. C., Jr. Differences in the reading
   status of good and poor eleventh grade students.
   J. educ. Res., 1948, 41, 498-515.
2. Baker, G., and Peatman, J. G. Tests used in Veterans
   Administration advisement units. Amer.
   J. Psychol., 1947, 49, 99-102.
3. Berkshire, J. R., Bugental, J. F., Cassens, F. P.
   Test preferences in guidance centers. Occupations,
   1948, 26, 337-343.
4. Betts, E. A. Foundations of reading instructions.
   New York: American Book Co., 1946. Pp. 445-485.
5. Bond, G. L. Identifying the reading attainments
   and needs of students. Yearb. nat. Soc. Stud.
   Educ., 1948, 47 (2), 224-249.
6. Center, S. S., and Persons, G. L. Teaching high
   school students to read. New York: Appleton-Century
   Co., 1937.
7. Christensen, T. E. Some observations with respect
   to the Kuder Preference Record. J. educ.
   Res., 1946, 40, 96-107.
8. Dale, E. and Chall, J. Formula for predicting
   readability instructions. Educ. Res. Bull., Ohio
   State Univ., 1948, 27 (2).
9. Flesch, R. Marks of readable style: a study in adult
   education. New York: Bur. of Publ., Teachers
   Coll., Columbia Univ., 1943. (Contr. to Educ.
   No. 897.)
10. Flesch, R. The art of plain talk. New York:
    Harper and Brothers, 1946.
11. Flesch, R. A new readability yardstick. J. appl.
    Psychol., 1948, 32, 221-233.
12. Flesch, R. The art of readable writing. New York:
    Harper and Brothers, 1949.
13. Gates, A. I. The improvement of reading. New
    York: The Macmillan Co., 1947. Pp. 631.
14. Gray, W. S. The nature and extent of the reading
    problems in American education. Educ. Record
    Supplement, 1938, 19 (11), 87-104.
15. Johnson, R. H. The problem of veterans’ reading
    level in the counseling process. Minnesota
    Counselor, 1948, 3 (4).
16. Klare, G. R. Understandability and indefinite
    answers to public opinion questions. Int. J.
    Opin. Attitude Res., 1950, 4, 91-96.
17. Klare, G. R. Evaluation of quantitative indices
    of comprehensibility in written communication.
    Unpublished Ph.D. thesis, 1950, Univ. of Minnesota.
18. Lorge, I. Predicting readability. Teach. Coll.
    Rec., 1944, 45, 404-419.
19. Lorge, I., and Blau, R. Reading comprehension
    of adults. Teach. Coll. Rec., 1941, 43, 189-198.
20. Monroe, M. Children who cannot read. Chicago:
    The University of Chicago Press, 1932. Pp.
    10-12.
21. Stefflre, B. The reading difficulty of interest
    inventories. Occupations, 1947, 26, 95-96.
22. Strong, E. K., Jr. Vocational interests of men and
    women. Stanford: Stanford Univ. Press, 1943.
    Pp. 694-702.
23. Tamminen, A. W. A comparison of the army
    general classification test and the Wechsler-Bellevue
    intelligence scales. Unpublished [illegible].
24. Witty, P. A. Current role and effectiveness of
    reading among youth. Yearb. nat. Soc. Stud.
    Educ., 1948, 47 (2).


How Readable are Occupational Information Booklets? 


Arthur H. Brayfield and Patricia Aepli Reed 


University of California, Berkeley, California 


Dullness and incomprehensibility have never 
to our knowledge been postulated as twin goals 
for writers of occupational information. In- 
deed, the injunction to make such information 
interesting and readily understandable has
been sounded since the early days of the
vocational guidance movement.

“It (occupational information) needs to be
put in terms that can be used by all kinds and
conditions of people . . . it should be made
available in very simple and popular form”
was the admonition of Charles R. Richards as
early as 1915 (3, 514). In the mid-twenties
Mary C. Schauffler wrote: “A study may be
clear, definite, accurate, and brief and yet be
dull reading, and on that account, may be of
little value for classroom purposes. It is not
enough that material be presented clearly and
directly, it must also be presented in a manner
interesting and understandable to the group
that is to use it” (4, 136). More recently, in a
comprehensive treatment of occupational infor- 
mation published in 1946, Shartle expressed 
the belief that “While information must be 
carefully checked for accuracy, it must also be 
arranged and written in a style that is readable 
and easy to use” (5, 76). 

What are the facts? Have writers and
publishers of occupational information succeeded
in producing readable materials? The
writers have discovered no studies bearing on
the problem; this paper reports a preliminary
investigation of the question “How readable
is occupational information?”


Procedure 


To answer this question we applied the revised
Flesch method of measuring readability
and human interest to sample passages from
current occupational information literature.

The revised Flesch readability formulas were
described at some length in 1948 (1). Formula A
is essentially a test of level of abstraction and
is thought to be an index of comprehension
difficulty. Formula B predicts the effect of
two “human interest” elements on comprehension.
Flesch considers its real value to lie in
the fact that “human interest will also increase 
the reader's attention and his motivation for 
continued reading" (1, 226). 

An attempt was made to sample widely the 
current occupational literature. Publications 
covering a variety of occupations and industries
were included from many different publishing 
sources. A major difficulty encountered was 
the paucity of occupational information at the 
so-called lower job levels. 

Reading ease and human interest scores were 
determined for these materials as follows: (a) 
Five samples were chosen from each piece of 
writing; (b) Each sample was marked off to in- 
clude 100 words; and (c) The steps outlined by 
Flesch were taken to compute the readability 
and human interest scores. 
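Steps (a) to (c) can be sketched as follows. The Reading Ease and Human Interest expressions are those published by Flesch (1); the data structure, the function names, the example counts, and the choice to average the five sample scores are assumptions made for illustration, since the procedure above does not say how the samples were combined.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """Counts taken from one 100-word sample (all values hypothetical)."""
    syllables: int           # syllables in the 100 words
    sentences: float         # number of sentences in the 100 words
    personal_words: int      # "personal words" in the 100 words
    personal_sentences: int  # sentences counted as "personal"


def reading_ease(s: Sample) -> float:
    words_per_sentence = 100 / s.sentences
    return 206.835 - 0.846 * s.syllables - 1.015 * words_per_sentence


def human_interest(s: Sample) -> float:
    pct_personal_sentences = 100 * s.personal_sentences / s.sentences
    return 3.635 * s.personal_words + 0.314 * pct_personal_sentences


def score_publication(samples: list[Sample]) -> tuple[float, float]:
    """Average the per-sample scores (an assumed way of combining them)."""
    return (mean(reading_ease(s) for s in samples),
            mean(human_interest(s) for s in samples))


# Five identical hypothetical samples, purely to show the call.
samples = [Sample(syllables=170, sentences=4, personal_words=2, personal_sentences=1)] * 5
print(score_publication(samples))
```

Keeping the raw counts per sample, rather than only the final scores, makes it possible to re-score the same passages if a different way of pooling the samples is preferred.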


Results 


First, we analyzed 31 publications describing 
professional level occupations. The results are 
given in Table 1. The readability standards 
suggested by Flesch (1) were referred to for 
interpretation of the findings. 

Flesch groups reading ease scores into seven 
categories ranging from “Very easy” to “Very 
difficult” with a further description according 
to representative magazine publications ranging
from “Comics” to “Scientific.” The most
striking fact obtained from this analysis is that 
65 per cent of the publications studied fall into 
the category “Very difficult” as represented by 
scientific journals and the remainder are classi- 
fied as “Difficult” or equivalent to academic 
magazines. 

With respect to human interest, Flesch de- 
scribes five categories of scores ranging from 
“Dull” to “Dramatic.” Typical magazines 
range from “Scientific” to “Fiction.” Of
these 31 publications 71 per cent are judged by 
the Flesch formula for human interest to be 
“Dull” and appropriate to scientific journals. 
The remaining 23 per cent are rated in the
adjacent category “Mildly interesting” on a
par with trade publications.

Table 1

“Reading Ease” and “Human Interest” Scores for Publications, Arranged by Source,
Describing Selected Professional Level Occupations

                                 Lawyer     Engineer   Pharmacist   Nurse      Teacher
Source                           RE   HI    RE   HI    RE   HI      RE   HI    RE   HI
U. S. Department of Labor         7    4    14    5    36    5       4    4    12    8
U. S. Employment Service         32    1    43    1     7   12      15    6    18   14
Science Research Associates      29   10    18    7    30    3      29    9    30    8
Institute for Research            3   13    11    7    37    2      26   10    21    8
Bellman Publishing Company       13    9    49    2    32    9      41   19    --   --
Occupational Index, Inc.         34    9     3    1    --   --      --   --     9   [illegible]
Professional Organizations       20    4    --   --    28    3      22   17     5    0

It might be hypothesized that these findings
are accounted for by the complexity and technical
nature of professional occupations. The
findings of our study of a limited sampling of
skilled and semi-skilled occupations lend only
slight support to this contention.

The data of Table 2 indicate that these 19
publications are slightly more readable than
those for professional occupations: only 53 per
cent fall into the “Very difficult” or scientific
classification for reading ease. However, none
of the samples rated less than “Difficult” or
academic, which is the next category on the
Flesch scale.

Results for the human interest analysis
follow the same pattern: 68 per cent may be
categorized as “Dull” or scientific. The remainder,
with one exception, are comparable
to trade journals or “Mildly interesting.” The
one exception barely rates as “Interesting.”

Industrial as well as occupational classifications
were studied. The results for reading
ease of these 16 publications dealing with
entire industries are reported in Table 3.
They are similar to the previous findings: of
these, 62 per cent rank as “Very difficult.” An
additional one fourth rated “Difficult,” one
was only “Fairly difficult” similar to quality
magazines, none were “Standard,” and in the
remaining publication the hotel industry
actually is described in a Science Research
Associates pamphlet in a way which merits
a description as “Fairly easy,” like slick
fiction.

Table 2

“Reading Ease” and “Human Interest” Scores for Publications, Arranged by Source,
Describing Selected Skilled and Semi-Skilled Level Occupations

Occupation

Source Secretary Plumber Welder Truck Driver Domestic

“RE” “HI” “RE” “HI” “RE” “HI” “RE” “HI” “RE” “HI”
U.S. Department of Labor 39 4 5 0 = —— = ex 
U. S. Employment Service = ae 29 1s 32 1 — == i Dos 
‘Science Research Associates 35 19 46 12 22 8 — = 40 14 

Institute for Research 31 20 20 49 4 31 17 : — i 

Bellman Publishing Company 23 1 = 9 = -— = = — 
‘Occupational Index, Inc. 3 Š zl = oa — - = " 5 
Commonwealth Book Company — = 1 " 19 7 15 8 2 si 

á 31 4 = z si a= 




Table 3

“Reading Ease” and “Human Interest” Scores for Publications, Arranged by Source,
Describing Selected Industries

                                 Retail     Motion Picture   Real Estate   Hotels     Iron and Steel
Source                           RE   HI    RE   HI          RE   HI       RE   HI    RE   HI
Science Research Associates      52   13    23   16          25   10       76   16    41    3
Institute for Research           --   --     3    1          12   14       --   --    --   --
Bellman Publishing Company        8   14    31   15          --   --       23   14    30    7
Occupational Index, Inc.         16    4    --   --          10    9       26   20    --   --
Commonwealth Book Company        16    9    --   --          --   --       --   --    --   --
Western Personnel Institute      --   --    42   13          --   --       --   --    --   --


Human interest values are slightly more
favorable for the potential consumer of these
publications. Only 37 per cent rate at the
extreme as “Dull.” However, all but one of
the rest fall into the adjacent “Mildly interesting”
classification. The single exception is
judged “Interesting” like digest magazines.

It was not a prime objective of this investigation
to make comparisons among publishers
with the one exception described in the next
paragraph. It may be noted in passing though
that there do not appear to be any significant
trends. No differences of any magnitude in
readability are suggested by these data although
they might appear if the entire series
of publications from each source were studied
in a similar manner. Actually none of these
publishers appears to be producing materials
which begin to meet Flesch standards indicative
of easy and interesting reading.

An analysis was made of the publications of
business and industrial concerns themselves to
see whether or not the highly ballyhooed
advertising practices of “big business” have
improved the readability of the occupational
information supplied by such firms. The results
of the study of the occupational information
materials of twelve companies are reported
in Table 4. These include such representative
titles as “What Shell Means to You,”
“Opportunities for Employment,” and “What
about Your Future?”

Actually, 83 per cent rate as “Very difficult.”
The remaining two publications are split between
“Difficult” and “Fairly difficult.”


Two thirds are judged “Dull.” One fourth 
are “Mildly interesting" and only one publica- 
tion achieves a designation as “Interesting.” 
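The proportions just quoted can be checked against the scores in Table 4. The sketch below assumes the category boundaries of Flesch's published scales (1): Reading Ease below 30 is “Very difficult,” 30 to 50 “Difficult,” 50 to 60 “Fairly difficult”; Human Interest below 10 is “Dull,” 10 to 20 “Mildly interesting,” 20 to 40 “Interesting.” The treatment of a score falling exactly on a boundary is an assumption, and all names are illustrative only.

```python
from bisect import bisect_right
from collections import Counter

# (Reading Ease, Human Interest) pairs as listed in Table 4.
TABLE_4 = {
    "Burroughs": (20, 5),
    "Chase National Bank": (29, 15),
    "Corning Glass Works": (23, 6),
    "Equitable Life Insurance": (14, 9),
    "General Motors": (15, 17),
    "J. C. Penney": (51, 36),
    "Procter and Gamble": (7, 9),
    "Roos Bros. Department Store": (36, 11),
    "Shell Oil Company": (23, 9),
    "Union Oil Company": (19, 9),
    "United Air Lines": (21, 1),
    "United States Steel": (29, 3),
}

RE_CUTS = [30, 50, 60, 70, 80, 90]
RE_LABELS = ["Very difficult", "Difficult", "Fairly difficult", "Standard",
             "Fairly easy", "Easy", "Very easy"]
HI_CUTS = [10, 20, 40, 60]
HI_LABELS = ["Dull", "Mildly interesting", "Interesting",
             "Highly interesting", "Dramatic"]


def label(score, cuts, labels):
    # A score equal to a cut point goes to the higher category (assumed).
    return labels[bisect_right(cuts, score)]


ease_counts = Counter(label(ease, RE_CUTS, RE_LABELS) for ease, _ in TABLE_4.values())
interest_counts = Counter(label(hi, HI_CUTS, HI_LABELS) for _, hi in TABLE_4.values())

n = len(TABLE_4)
for counts in (ease_counts, interest_counts):
    print({category: f"{100 * k / n:.0f}%" for category, k in counts.items()})
```

With twelve publications, ten in the lowest Reading Ease category corresponds to the 83 per cent reported, and eight, three, and one in the three Human Interest categories correspond to two thirds, one fourth, and the single “Interesting” publication.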


Summary 


In all, 78 pieces of occupational information
literature from 24 different sources were analyzed.
Almost two thirds ranked as “Very
difficult” or at the scientific level with respect
to reading ease while another 32 per cent were
ranked “Difficult.”

Almost exactly the same proportions held for
the categories “Dull” and “Mildly interesting”
when human interest scores were studied.


Table 4

“Reading Ease” and “Human Interest” Scores for
Occupational Information Publications of
Private Business and Industry

Source                          “RE”   “HI”
Burroughs                        20      5
Chase National Bank              29     15
Corning Glass Works              23      6
Equitable Life Insurance         14      9
General Motors                   15     17
J. C. Penney                     51     36
Procter and Gamble                7      9
Roos Bros. Department Store      36     11
Shell Oil Company                23      9
Union Oil Company                19      9
United Air Lines                 21      1
United States Steel              29      3




Fewer than five per cent of these publications 
reach the readability level of the popular “digest” 
magazines. 

Within the limitations of the Flesch formulas 
for measuring readability and the limitations 
of our sampling of occupations, industries, and 
publishers we conclude that current occupa- 
tional information falls far short of meeting 
the requirements for comprehension and in- 
terest which have been suggested through the 
years by persons intimately concerned with 
the preparation and use of such information. 
Dullness and incomprehensibility reign supreme.
Writers and publishers of occupational
information might well consult Flesch
(2) and others if they are seriously interested
in “The Art of Readable Writing.”

Received October 28, 1949.


References

1. Flesch, R. A new readability yardstick. J. appl.
   Psychol., 1948, 32, 221-233.
2. Flesch, R. The art of readable writing. New York:
   Harper and Brothers, 1949.
3. Richards, C. R. What we need to know about occupations.
   In M. Bloomfield (Ed.), Readings in
   vocational guidance. Boston: Ginn and Company,
   1915. Pp. 504-514.
4. Schauffler, M. C. Standards for evaluating occupational
   studies for a critical bibliography. In
   F. J. Allen (Ed.), Practice in vocational guidance.
   New York: McGraw-Hill, 1927. Pp. 132-195.
5. Shartle, C. Occupational information. New York:
   Prentice-Hall, 1946.


Validity of an Emotional Key on a Short Industrial 
Personality Questionnaire * 


Mildred B. Mitchell 
VA Mental Hygiene Clinic, Ft. Snelling, St. Paul, Minnesota 


and 
Harold F. Rothe 


Stevenson, Jordan & Harrison, Inc., Chicago, Illinois 


This is the third in a series of papers dealing
with the Objectivity score on a short personality
questionnaire designed for industrial use.
The questionnaire is restricted and is not available
for general use, but the principles involved
are more important than is the particular form.
In the first paper (3) the questionnaire was described
and the use of the Objectivity key,
patterned after the L scale of the MMPI (2),
was described. It was shown there that
Emotionality scores, in particular, seemed to
be related to Objectivity scores when the
questionnaire was administered to industrial
personnel.

The information presented in that paper and
the use of that information as described is of
no value unless it can be shown that the Objectivity
key is a valid measure of “faking” and
unless the other keys are also valid for measuring
various personality characteristics. A
second paper described an experiment with
college students in which it was shown that the
Objectivity key, and other keys could be
“faked,” if the respondents so desired (1).
Faking the questionnaire to “look good”
resulted in low Emotional and also low Objectivity
scores that gave the interviewer a clue
that “faking” might be present. Thus it was
concluded that the Objectivity key is a valid
measure of what it is intended to measure.
As described in the first paper, the Objectivity
key is interpreted in industrial practice to
locate highly objective persons rather than
to locate “fakers.”

The purpose of the present paper is to
present evidence on the validity of the Emotionality
key. This key is shown to be a valid
measure of maladjustment. Accordingly the
validity of the procedure of using the Objectivity
scores to determine the set of norms to
use in interpreting the Emotionality key on a
short personality questionnaire (50 items) in
industrial screening is demonstrated.¹

* Published with permission of the Chief Medical
Director, Department of Medicine and Surgery,
Veterans Administration, who assumes no responsibility
for the opinions expressed or the conclusions drawn by
the authors.


The Criterion Group 


For purposes of this study the personality 
questionnaire was administered routinely to a 
sample of 100 male patients consecutively ad- 
mitted to the Mental Hygiene Clinic of a 
Veterans Administration Center. All of the 
respondents had been admitted to the clinic as 
patients, and this fact is used as the criterion 
of their maladjustment. The specific nature 
of their maladjustments is unimportant here 
because the Emotionality key under discussion 
is a screening key and does not yield responses 
that are diagnostically refined. 

For comparative purposes a sample of 100 in- 
dustrial employees was selected from master 
data sheets of test scores. Every fifth man was 
selected, beginning with the most recent entry 
on the data sheets and proceeding until a 
sample of 100 cases was obtained. The two 
groups, clinical and industrial, are accordingly 
not matched in any respect except sex. 


Results 


Complete distributions of the scores of these
two groups together with the means and
standard deviations on each of four keys
(Objectivity, Emotionality, Social Dominance,
and Drive) are presented in Tables 1, 2, 3,
and 4.

¹ The writers wish to thank Miss Judy Yackle for
assisting in the preparation of this paper, and the Psychological
Trainees, particularly Arthur Schomp and
Harold Gilberstad, at the VA Mental Hygiene Clinic,
for helping to collect the data on patients.

Table 1

Distributions of Objectivity Scores of 100 Clinical
Patients and 100 Industrial Employees

Objectivity     Clinical     Industrial
Score           Patients     Employees
0                   5             5
1                  19            20
2                  22            23
3                  26            22
4                  18            18
5                   6            10
6                   4             2
Mean              2.7           2.7
S.D.              1.5           1.5

The Objectivity Key. The two groups were 
surprisingly well matched when their Objec- 
tivity scores were compared. The mean score 
for each group was 2.7 and the S.D. for each 
group was 1.5, as is shown in Table 1. The 
critical ratio of the differences between the 
means of these distributions is .05, which is a 
non-significant difference. 
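The means and standard deviations quoted here can be recomputed from the frequencies in Table 1. The following sketch assumes the population form of the standard deviation (dividing by N), which the paper does not specify; the names are illustrative.

```python
from math import sqrt

# Objectivity score -> number of persons, from Table 1.
clinical = {0: 5, 1: 19, 2: 22, 3: 26, 4: 18, 5: 6, 6: 4}
industrial = {0: 5, 1: 20, 2: 23, 3: 22, 4: 18, 5: 10, 6: 2}


def mean_and_sd(freq):
    """Return N, mean, and S.D. of a score -> frequency table."""
    n = sum(freq.values())
    mean = sum(score * f for score, f in freq.items()) / n
    variance = sum(f * (score - mean) ** 2 for score, f in freq.items()) / n
    return n, mean, sqrt(variance)


for name, freq in (("clinical", clinical), ("industrial", industrial)):
    n, m, sd = mean_and_sd(freq)
    print(f"{name}: N={n}, mean={m:.2f}, S.D.={sd:.2f}")
```

Both groups contain 100 cases and both means round to 2.7, in line with the figures in the text; the standard deviations come out close to the reported 1.5.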

Table 2

Distributions of Emotionality Scores of 100 Clinical
Patients and 100 Industrial Employees

Emotionality    Clinical       Industrial
Score           Patients       Employees
0                   0              0
1                   1              7
2                   3             11
3                   2             22
4                   5             11
5                   9             18
6                   8              8
7               [illegible]        8
8                  18              6
9                  14              9
10                 14              0
11              [illegible]        0
12                  6              0
13                  2              0
Mean              7.9            4.6
S.D.              2.6            2.3

The Emotionality Key. The two groups
were quite disparate on the Emotionality key,
as shown in Table 2. The mean score for the
clinical sample was 7.9, with an S.D. of 2.6.
The mean for the industrial sample was 4.6,
with an S.D. of 2.3. The critical ratio between
these two samples was 9.3; hence it is
concluded that the difference is a real one.
Since the clinical sample was maladjusted
according to an outside criterion, and since the
industrial sample was not known to be maladjusted,
it is concluded that the Emotionality
key is a valid measure of maladjustment.
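A critical ratio of this kind is conventionally the difference between two means divided by the standard error of that difference. The sketch below uses the usual expression for independent samples, which is an assumption because the paper does not state how its ratios were computed; applied to the rounded means and S.D.s it gives a value close to, but not identical with, the reported 9.3.

```python
from math import sqrt


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between means over the standard error of the difference."""
    standard_error = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# Emotionality key: clinical patients vs. industrial employees (rounded values).
cr = critical_ratio(7.9, 2.6, 100, 4.6, 2.3, 100)
print(round(cr, 1))
```

Small discrepancies of this size would be expected if the original ratio was computed from unrounded means and standard deviations.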
The correlation between the Emotionality
and the Objectivity scores for this clinical
sample was .25, which is small enough to have
little or no practical significance. This is an
important point since, as previously shown by
a co-variation of distributions, these keys are
rather highly related for industrial or college
samples. For the industrial sample discussed
in this paper, the correlation between these
two keys was .52.

Thus it is shown that there is not an inevitable
high degree of relationship between the
Objectivity and the Emotionality scores obtained
on this questionnaire. The fact that
such a relationship does occur with industrial
samples, and does occur when college students
take the questionnaire and “fake” it, further
strengthens the belief that low Objectivity
scores are indicative of the existence of “faking.”
Stating this in terms whereby the form
is used in actual practice, the belief is strengthened
that persons who obtain high Objectivity
scores and high Emotional scores in industrial
interviewing situations should not necessarily
be considered maladjusted. Rather, their
scores should be interpreted on the basis of
norms based upon persons of high Objectivity
scores. Persons with low Objectivity scores
and high Emotionality scores may be suspected
of poor adjustment, requiring more intensive
interviewing and testing procedures. The relationship
of Objectivity scores to Emotionality
scores does not appear to be a spurious one in
industrial practice.
The Social Dominance and Drive Keys. On
the Social Dominance scale the clinical patients
obtained lower scores than did the industrial
sample. These data are presented in Table 3.
The clinical mean was 4.9, with an S.D. of 2.2.
The industrial mean was 6.2, with an S.D. of
1.7. The critical ratio between the two
groups was 4.69, which indicates a significant
difference.

Table 3

Distributions of Social Dominance Scores of 100 Clinical
Patients and 100 Industrial Employees

Social
Dominance       Clinical     Industrial
Score           Patients     Employees
0                   0             0
1                   8             2
2                  11             1
3                   9             2
4                  16            12
5                  13            16
6                  16            17
7                  16            27
8                   6            18
9                   5             5
Mean              4.9           6.2
S.D.              2.2           1.7

On the Drive scale also the clinical patients
were lower than the industrial sample as shown
in Table 4. The clinical mean was 4.9 and
the S.D. was 1.5. The industrial mean was
5.8 and the S.D. was 1.6. The critical ratio
between the two groups was 3.52, which is
again a significant difference.

Table 4

Distributions of Drive Scores of 100 Clinical Patients
and 100 Industrial Employees

Drive           Clinical       Industrial
Score           Patients       Employees
0                   0              0
1               [illegible]        1
2                   2              1
3               [illegible]        8
4                  23             12
5                  27             17
6               [illegible]       28
7                  12             20
8                   7              9
9                   0              4
Mean              4.9            5.8
S.D.              1.5            1.6

Thus the clinical patients differed significantly
from the industrial sample on three keys:
Emotionality, Social Dominance, and Drive;
but did not differ on Objectivity. The previous
paper by Carr and Rothe (1) showed that
students, when “faking to look good,” obtained 
low Objectivity scores, low Emotionality scores, 
high Social Dominance scores, and unchanged 
Drive scores; when “faking to look bad" the 
students obtained high Objectivity scores, high 
Emotionality scores, low Social Dominance 
scores and low Drive scores. Thus the stu- 
dents, when “faking” could alter their scores 
on all three keys, but their extreme Objectivity 
scores revealed that they were "faking" and 
made possible an adjustment in the interpre- 
tation of their scores. 

The clinical patients were apparently “normal”
in their Objectivity, but low in Social
Dominance and low in Drive. Accordingly, it 
is concluded that these patients as a group were 
actually low in those two characteristics. 
There is no reason to suspect that they were 
“faking” their scores on those two keys, par- 
ticularly since they ranked on the low, and for 
many positions the undesirable, ends of the 


scales. 
Conclusions 


Data are presented in this paper showing 
that clinical patients who may be considered 
neurotic obtained significantly high scores on 
the Emotionality key of this short personality 
questionnaire, when contrasted with a random 
sample of industrial personnel. It is concluded
from these data that this form is a valid 
screening device for indicating neuroticism. 
The form used here is not available for general 
use, since it has been developed by one com- 
pany to aid its interviewing procedures. But 
the principles described here have broader im- 
plications. These are: (1) that a very short 
personality questionnaire for industrial pur- 
poses can be a valid screening device that will 
indicate neuroticism; and (2) that the use of an 
Objectivity key, similar to the L scale of the 
MMPI, can be a valid device that will aid in 
the interpretation of scores. It is apparent 
that other personality questionnaires can be 
made that will be as valid and as short as this 
one, and hence the restriction of the distribu- 
tion of this form is actually not a serious 


handicap. 


Received November 22, 1949. 


332 Mildred B. Mitchell and Harold F. Rothe 


References 


1. Carr, E. R., and Rothe, H. F. Validity of an ob- 
jectivity key on a short industrial personality 
questionnaire. J. appl. Psychol., 1950, 34, 178- 
181. 


2. Hathaway, S. R., and McKinley, J. C. Manual for
the Minnesota Multiphasic Personality Inventory.
New York: The Psychological Corporation, 1943.


3. Rothe, H. F. Use of an objectivity key on a short
industrial personality questionnaire. J. appl.
Psychol., 1950, 34, 98-101.


Overall Job Success as a Basis for Employee Ratings 


C. E. Jurgensen 
Minneapolis Gas Company 


The "halo" effect in merit rating is well
known, and many different procedures have
been devised to reduce its effect. In spite of
such attempts, intercorrelations have remained
so high as to cast doubt on the value of ratings
on separate factors. In some cases intercorrelations
have actually been higher than trait
reliabilities (4). This has led some investigators
(1, 5, 6) to recommend overall ratings
variously called “performance on present job,”
“value to the company,” “overall job success,”
etc.

Overall ratings lack the specific information
found in trait ratings, but this objection is more
theoretical than practical if intercorrelations
between trait ratings are higher than their
reliabilities. Overall ratings have certain advantages
over trait ratings. They are apt to
agree with the foreman’s statements when he
is questioned regarding an employee, and they
are apt to agree with promotions, transfers,
demotions, discharges and other personnel
moves which are supposedly based on merit.
Some persons condemn overall ratings on the
basis that they are not analytical and may not
be based on objective evidence. Yet these
same persons may “validate” a trait scale on
the basis of overall ratings or on promotions
based on overall judgments. It would appear,
then, that many persons believe that overall
ratings are more valid than other means of
determining employee merit and that some
persons who disagree with this viewpoint in
theory accept it in practice.

This report discusses two types of employee
ratings based on overall job success.


Part I. Rank Order Merit Ratings


The rank order rating technique has received
comparatively little attention in the literature
of industrial psychology. The technique
consists of ranking employees on a specified
continuum from best to worst. In the cases
described here the continuum consisted of overall
merit. The name of each employee was
typed on a 3X5" index card. The pack of 
name cards was given the rater with the infor- 
mation that names were in no particular order. 
Simultaneously, the pack of name cards was 
shuffled. Instructions were: “Please arrange 
these names in order from best to worst. 
Your best employee should be placed on top 
and the worst on the bottom. You can start 
from either end, or start from both ends and 
work toward the middle. You can make as 
many changes as you wish." 

Rank orders were subsequently converted to 
scale scores advocated by Hull (3) and Guilford 
(2). Intercorrelation between the two types of 
scale scores was .998 with a total of 115 cases.
Only the Hull method was subsequently used 
and reported here. 


Case 1. Three foreladies ranked forty-four 
Inspector-Packers on an overall basis. Cor- 
relations between the raters were .88, .80, and 
.76. Stepped up reliability for the sum of the
three ratings was .93.

Case 2. Three foreladies ranked twelve In- 
spector-Packers on an overall basis. Correla- 
tions between raters were .89, .77, and .72. 
Stepped up reliability was .92. 

Case 3. Five foreladies ranked twenty-three
Inspector-Packers on an overall basis.
One month later the ranking was repeated 
without reference to the prior rankings. Cor- 
relations between the first and second rankings 
obtained for each forelady were .89, .88, .86, 
.76, and .73. Stepped up reliabilities for the 
sum of the two rankings by each forelady were 
.94, .94, .93, .86, and .85.

Intercorrelations were computed between 
the five raters. These are given in Table 1. 
Four of the five correlations lower than .80 in- 
volved forelady D who had been a forelady for 
less than one month and had had no previous 
rating experience or training. She was subsequently
(and unrelated to these or other
ratings) demoted to a non-supervisory position.




Table 1


Intercorrelations Between Five Foreladies Ranking
Twenty-Three Inspector-Packers


        A      B      C      D
A
B     .45
C     .84    .81
D     .66    .65    .75
E     .87    .82    .85    .70


Forelady B likewise had less than one month 
experience, but was subsequently considered 
a “good” forelady. 

Case 4. Three inexperienced foreladies
ranked thirty-four Inspector-Packers on an
overall basis. Each forelady had been in
supervisory work less than one month, and
none had received any training in rating purposes
or procedures. Intercorrelations between
their ratings were .54, .53, and .19.

Case 5. An Assistant Sales Manager ranked 
fifty-two Salesmen on an overall basis of “value 
to the company as a salesman." To facilitate 
the ranking of a number as large as fifty-two, 
he was first asked to sort the name cards into 
three piles ("best," “average,” and “worst”) 
containing approximately the same number of 
names each. The three groups were then 
placed in rank order and merged. Changes of 
opinion in rank order were permitted until the 
judge was completely satisfied with his rank 
order. One month later the process was repeated.
The Pearsonian correlation between
the repeated ratings (scaled scores) was .94.
Stepped up reliability for the sum of the two
ratings was .97.


Rating reliability of this magnitude is so 
high as to be suspect. No evidence was found 
to indicate it was spuriously high or that 
similar results could not be obtained by other 
persons under the same conditions. It is 
believed that the high reliability obtained was 
due primarily to four factors: (1) Rank order
merit ratings on an overall basis may inherently
be more reliable than more frequently used
procedures; (2) The rater was an exceedingly
conscientious person and was highly motivated
to be as accurate as possible in his ratings; (3)
During the previous year the rater had received
almost twenty hours of individual
training in rating purposes, principles, and
procedures; and (4) The rater was well acquainted
with his subordinates, having worked
intimately with them for several years.
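The "stepped up" reliabilities reported in the cases above can be illustrated with the Spearman-Brown prophecy formula. The sketch below is offered under an assumption the report does not state: that for three raters the formula was applied to the average of the three inter-rater correlations. The function names are illustrative only.

```python
def spearman_brown(r, k):
    """Reliability of the sum of k comparable ratings whose typical intercorrelation is r."""
    return k * r / (1 + (k - 1) * r)


def stepped_up_from_pairs(pair_correlations, k):
    """Assumption: average the pairwise correlations, then step up for k raters."""
    mean_r = sum(pair_correlations) / len(pair_correlations)
    return spearman_brown(mean_r, k)


case_1 = stepped_up_from_pairs([0.88, 0.80, 0.76], k=3)  # three foreladies
case_2 = stepped_up_from_pairs([0.89, 0.77, 0.72], k=3)  # three foreladies
case_5 = spearman_brown(0.94, k=2)                       # one rater, two occasions

print(round(case_1, 2), round(case_2, 2), round(case_5, 2))
```

Under this assumption the sketch agrees with the values reported for Cases 1, 2, and 5. Some of the Case 3 values differ by .01 when recomputed from the two-decimal correlations, which would be expected if the original computation used unrounded coefficients.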


Conclusions for Part I 


1. Ratings obtained from experienced supervisors
were more reliable than those obtained
from inexperienced supervisors.

2. Highest reliability was obtained from the
supervisor who was highly motivated, had
received individual training in rating, and was
thoroughly familiar with the work of his subordinates.

3. Rank order ratings on an overall basis are
simple to obtain and can have a high degree of
reliability.


Part II. Multiple Item Scale for 
Rating Overall Job Success 


The simple rank order method just described
will not always be suitable. It requires that
ratings be made for a group of employees.
Sometimes a single employee must be rated, as
when determining action to be taken on a
probationary employee. Although rank order
rating is not applicable, an overall rating may
be desired. However, it is somewhat disturbing
to base a decision on a single overall rating
in view of the commonly accepted hypothesis
that, other factors being equal, the reliability
of a measurement is related to the number of
items making up the measurement.

This section deals with an attempt to devise
a four item scale in which each item consists of
an overall rating of job success. Such a scale
permits computation of split-half reliabilities
for groups of employees and internal consistency
for a single employee.

In developing a multiple item rating scale
for overall job success it was considered important:
(1) to word items in such a way that
their identicalness was not obvious; (2) to
provide a varying number of rating categories
so that raters would not automatically check
the same category in each item; and (3) to
provide two open end questions on the employee's
strong points and weak points. A
copy of such a scale is given in Figure 1.


Name of employee ________    Position ________
Department ________    Rated by ________    Date ________

[ ] Exceedingly well satisfied with employee
[ ] Well satisfied with employee
[ ] [illegible]lly satisfied with employee
[ ] Somewhat disappointed with employee
[ ] Quite disappointed with employee
[ ] [illegible] disappointed with employee

Where would you grade this employee in a large
group of persons holding this same job?

[ ] Lowest 10%
[ ] Next 20%
[ ] Middle 40%
[ ] Next 20%
[ ] Highest 10%

Which of the following terms best describes the
overall job performance of this employee?

[ ] Excellent, far exceeds job requirements
[ ] Good, exceeds job requirements
[ ] Average plus, slightly above job requirements
[ ] Average, meets job requirements
[ ] Average minus, slightly below job requirements
[ ] Poor, partially meets job requirements
[ ] Very poor, does not meet job requirements

If at the time this employee was hired
you knew everything you now know about
him, would you have recommended his
employment?

[ ] Definitely no
[ ] Probably no
[ ] Probably yes
[ ] Definitely yes

What do you consider to be the greatest
strong points of this employee?

What do you consider to be the greatest
weaknesses of this employee?

Fig. 1. Employee evaluation report including the composite normalized score weights
based on total of 810 ratings.


Procedure. Twenty-one supervisors representing
all divisions of the company filled in 405
Employee Evaluation Reports. The largest
number of employees rated by a single supervisor
was thirty-seven and the smallest number
was four.

Ratings on each item were converted into
normalized standard scores using the mid-point
percentile method. These scores were expressed
with a mean of 50 and a sigma of 10
and comprise the weights given each rating.
One month after the first ratings were obtained,
the same supervisors again rated the same employees.
Standard score weights were computed
as previously.
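The conversion of rating categories to normalized standard scores can be sketched as follows. This is one common reading of the mid-point percentile method, not necessarily the exact procedure used in the study: each category is assigned the percentile at the middle of its frequency band, the percentile is converted to a normal deviate, and the deviate is rescaled to a mean of 50 and a sigma of 10. The category frequencies in the example are hypothetical.

```python
from statistics import NormalDist


def normalized_weights(freqs, mean=50.0, sigma=10.0):
    """Return one normalized standard score per rating category, lowest category first.

    freqs: number of ratings falling in each category.
    Unused categories (frequency 0) get None, since no mid-point percentile
    can be distinguished for them.
    """
    total = sum(freqs)
    weights = []
    below = 0
    for f in freqs:
        if f == 0:
            weights.append(None)
            continue
        # Percentile at the mid-point of this category's band.
        p = (below + f / 2) / total
        z = NormalDist().inv_cdf(p)
        weights.append(mean + sigma * z)
        below += f
    return weights


# Hypothetical frequencies for a five-category item rated on 405 employees.
example = normalized_weights([40, 80, 165, 80, 40])
print([round(w, 1) for w in example])
```

Because the weights of all items are on the same scale, they can be added across items, which is what makes the split-half computations described later possible.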

The normalized score weights for the first 
ratings of 405 employees were compared with 
those of the repeated ratings. The mean difference
was less than .3. Figure 1 includes the
composite normalized score weights based on 
the total N of 810. These weights apply 
strictly only to the raters and employees in- 
cluded in the study. The method of selecting 
raters and employees permits generalization
throughout the company concerned, but 
weights cannot be assumed to be applicable 
to other companies. 

Scale Reliability. Inasmuch as weights are 
expressed in terms of standard scores, weights 
of two or more items can be added directly. 
This permits computation of split half reliability
where two items comprise one half and
the remaining two items the other half. The 
reliability (stepped up, as usual, by the 
Spearman-Brown prophecy formula) was .94 
for the first set of ratings (N = 405) and likewise
.94 for the second set of ratings (N = 405).

Twelve supervisors each rated seventeen or 
more employees. Split half reliability was 
computed for each individual supervisor. 
These reliabilities ranged from .82 to .99 with 
a median of .96 for the first set of ratings. For 
the repeated set of ratings these reliabilities 
ranged from .87 to .98 with a median of .94. 
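The split half computation described above can be sketched in a few lines. The report does not say which two items formed each half, so the pairing used here (items a and b against items c and d) is an assumption, and the standard score weights in the example are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_a, item_b, item_c, item_d):
    """Correlate the two half-scale sums, then step up with Spearman-Brown (k = 2)."""
    half_1 = [a + b for a, b in zip(item_a, item_b)]  # assumed pairing
    half_2 = [c + d for c, d in zip(item_c, item_d)]
    r = pearson_r(half_1, half_2)
    return 2 * r / (1 + r)


# Hypothetical standard score weights for six employees on the four items.
a = [62, 55, 48, 50, 41, 36]
b = [60, 52, 50, 47, 43, 38]
c = [63, 54, 49, 51, 40, 35]
d = [59, 56, 47, 49, 44, 37]

reliability = split_half_reliability(a, b, c, d)
print(round(reliability, 2))
```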

Repeat reliability was .88 for the total of 405
employees. For individual supervisors who 
rated seventeen or more employees, the reli- 
abilities ranged from .78 to .98 with a median 
of .92. 

Four supervisors each rated one group of 36 
employees, thus providing six intercorrelations.
For the first set of ratings they ranged from .60 
to .83 with a median of .71. For the repeated 
ratings they ranged from .67 to .84 with a 
median of .76. 

The above reliabilities follow the usual
pattern of highest reliability for split halves,
next highest for repeated ratings, and lowest
for ratings by different supervisors. All reliabilities
appear high as compared with reliabilities
typically obtained from ratings.
This is particularly significant in view of the
fact that no supervisor was given any training
whatsoever in the use of this scale, and few, if
any, had received training in the use of any
rating scale. This procedure was deliberately
followed to avoid possible difficulties resulting
from training on an experimental scale which
might subsequently be discarded. The assumption
was made that if the scale proved
to be satisfactory without any training it
would be even more so with training.

Item Reliability. Reliability of the scale as
a whole is dependent on the reliability of each 
item in the scale. Item reliabilities were there- 
fore computed for the 405 repeated ratings by 
means of contingency coefficients corrected for 
errors of grouping due to small number of 
classes. The coefficients reported here can be 
considered equivalent to the customary Pearsonian
correlation. Repeat reliabilities of the
four items (N = 405) are: (a) .87; (b) .80; (c)
.82; and (d) .90.

Item Consistency. Item reliability can also
be interpreted on the basis of item consistency.

The percentage of employees (N = 405)
given identical ratings on two occasions on
each of the items was: (a) 77.0%; (b) 64.9%;
(c) 61.5%; and (d) 81.0%. Very few discrepancies
greater than one step were found
for any item, such discrepancies being: (a)
1.0%; (b) 2.0%; (c) 2.2%; and (d) .7%. Because
of the varying number of steps in the four
scale items the above percentages are not
strictly comparable. An index was therefore
devised to express the number and extent of
inconsistencies, in relation to the maximum
spread.¹

The item indexes for repeated ratings
(N = 405) were: (a) .95; (b) .91; (c) .93; and
(d) .93. Item indexes for ratings by different
raters (N = 432) based on all possible pairings
of 36 employees rated by four supervisors
were: (a) .94; (b) .88; (c) .92; and (d) [illegible].
Item consistency appears satisfactory, and no
one item stands out as appreciably superior
or inferior to any other item.²
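The consistency index can be illustrated with a short sketch. It follows the footnoted formula as reconstructed from its stated limits: one minus the sum of step discrepancies, divided by the maximum possible sum (the maximum step discrepancy times the number of ratings). The discrepancy counts below are hypothetical; they were chosen only to be consistent with the percentages reported for item (a), taking that item to have six categories and assuming every discrepancy greater than one step was exactly two steps.

```python
def consistency_index(discrepancy_counts, max_step):
    """Index of rating consistency between two sets of ratings.

    discrepancy_counts: mapping from step discrepancy (0, 1, 2, ...) to the
        number of paired ratings showing that discrepancy.
    max_step: largest discrepancy the item allows (number of categories - 1).
    Returns 1.0 for complete consistency and 0.0 when every pair of ratings
    differs by the maximum possible amount.
    """
    total = sum(discrepancy_counts.values())
    weighted = sum(step * n for step, n in discrepancy_counts.items())
    return 1 - weighted / (max_step * total)


# Hypothetical counts for a six-category item rated twice on 405 employees:
# 312 identical ratings, 89 one-step and 4 two-step discrepancies.
item_a = consistency_index({0: 312, 1: 89, 2: 4}, max_step=5)
print(round(item_a, 2))
```

Dividing by the maximum step discrepancy is what makes items with different numbers of categories comparable, which simple percentages of agreement are not.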

¹ The formula is:

$$\text{Index} = 1 - \frac{N_1 + 2N_2 + 3N_3 + \cdots + nN_n}{n\,N_{(0+1+2+3+\cdots+n)}}$$

where N = number of ratings with the step discrepancy
indicated by the subscript, and n = maximum
step discrepancy. The formula is at a maximum of
1.00 when ratings are completely consistent and at a
minimum of .00 when each employee is rated as high as
possible at one time and as low as possible at another
time.

² Item b contained the phrase "next 20%" in two
places. This proved deceptive to raters, and appears
to have lowered the reliability and consistency of the
item. It has subsequently been reworded to “next
lowest 20%” and “next highest 20%” and it is believed
that the reliability of the item will be improved.


Item Intercorrelations. The four items comprising
the scale were intended to be four
ratings of the same characteristic, namely an 
overall rating of job success. Split half reli- 
ability was found to be high, thus indicating 
that the two halves were equivalent. Further 
evidence on this point is obtained from inter- 
correlations of the four items. These range 
from .82 to .90 thus being essentially the same, 
and essentially the same magnitude as the 
repeat reliability of each item. 


Conclusions for Part II 


1. Without receiving any training in the use
of the scale, twenty-one supervisors rated 405
employees on two occasions on a four item
overall merit rating scale. Under these circumstances
split half reliability was .94, repeat
reliability was .88 and correlation of ratings
by different supervisors ranged from .60 to
.84 with n’s of 36 each.

2. Item reliabilities and consistency indexes
were satisfactory for each of the four items.

3. Item intercorrelations were of essentially
the same magnitude as repeat reliabilities.

4. It is reasonable to believe that the results 
would be even more favorable if supervisors 
were trained in the use of the rating scale. It 
is therefore concluded that the technique of 
multiple overall ratings is a promising and 
practical technique. 

Received November 18, 1949. 


References 


1. Ewart, E., Seashore, J. E., and Tiffin, J. A factor
analysis of an industrial merit rating scale. J.
appl. Psychol., 1941, 25, 481-486.

2. Guilford, J. P. Psychometric methods. New York:
McGraw-Hill, 1936, pp. 250-251.

3. Hull, C. L. Aptitude testing. Yonkers: World Book
Company, 1928, pp. 382 ff.

4. Jurgensen, C. E. Intercorrelations in merit rating
traits. J. appl. Psychol., 1950, 34, 240-243.

5. Lawshe, C. H., Kephart, N. C., and McCormick,
E. J. The paired comparison technique for
rating performance of industrial employees. J.
appl. Psychol., 1949, 33, 69-77.

6. Pond, M. Success of factory workers. Occupations,
1936, 14, 940-944.


A Comparison of the Terman-Miles M-F Test and the 
Mf Scale of the MMPI 


Olga E. de Cillis and William D. Orbison 


University of Connecticut 


Vocational counseling presents many occa- 
sions to use both the Terman-Miles Attitude- 
Interest Analysis Test and the Minnesota 
Multiphasic Personality Inventory. The use 
of these tests, however, has frequently revealed 
marked discrepancies in scores on the mascu- 
linity-femininity scales. These discrepancies, 
like the low correlations reported in the litera- 
ture for other masculinity-femininity scales (10, 
12), suggest the hypothesis that although the 
T-M and the MMPI tests may discriminate 
between the sexes, they are nevertheless mea- 
suring different aspects of masculinity-femininity.
An opportunity² to test this hypothesis
on a large group was provided when both tests 
were incorporated in a battery of tests being 
used provisionally in the School of Business 
Administration at The University of Con- 
necticut. 


Subjects 


The subjects used in this study were 129 
men and 50 women undergraduate students at 
The University of Connecticut. The 129 men 
were junior and senior students from the School 
of Business Administration enrolled in a course 
in industrial psychology. They ranged in age 
from 20 to 31; the median age was 24. The 
50 women were freshmen and sophomores in 
the College of Arts and Sciences enrolled in an 
introductory course in psychology. Their ages 
ranged from 17 to 25; the median age of the 
women was 18. 


Procedure 


The tests used were the T-M, Form B and
the group form of the MMPI. Test administration
took place during class periods with a
lapse of one to two weeks between each administration.
Presentation of the tests was varied
in AB, BA order. The nature and purpose of
the study were concealed in order to avoid the
falsification to which masculinity-femininity
tests are particularly susceptible (Kelly, Miles
and Terman (11), Gough (6), Benton (1), and
Meehl and Hathaway (15)).

The MMPI was machine scored; the T-M
was hand scored. Since the chances of
making errors in hand scoring the T-M are
great, each test was hand scored by two different
individuals. Errors revealed by differences
in the two scorings were corrected.


¹ Hereafter the Terman-Miles Attitude-Interest Analysis
Test will be referred to as T-M; the Minnesota
Multiphasic Personality Inventory as MMPI.

² Kindly provided by Dean Laurence J. Ackerman.


Results 


MMPI. The men in this sample obtain a
score somewhat more feminine than that reported
by Schmidt (17) for men in a non-college
population (M = 18.0) and somewhat more
feminine also than the mean for the normed
population. The manual reports a mean raw
score of 20.5 (9, p. 11) with a standard deviation
of 5.0; the present study has yielded a
mean raw score of 24.6, standard deviation 4.86
(Table 1). The fact that the men in this study
obtain a more feminine score is to be expected
since it has been demonstrated that
college men score above the published norms
(Brown (2), Hathaway³ (2)).

The data for women, as far as [...] compare
[...] a standard deviation of 5.0. [...]
a mean raw score of 37.1 and a standard deviation
of 4.30 were obtained. The data in this
investigation accord with the norms as well as
with the results of Verniaud (20) on a non-college
population and with Brown (2), Hampton
(8), Lough (12), and Nance (16) on college
populations.


³ [...] from personal communication in Brown (2).


Table 1


Means and Standard Deviations of M-F Test Scores


                        Mean Raw    Standard
Test     Sex       N    Scores      Deviations
MMPI*    Men      129    24.6        4.86
         Women     50    37.1        4.30
T-M†     Men      129   +73.7       43.96
         Women     50   -32.6       42.45


* For the MMPI larger raw scores mean greater
femininity for both sexes. The manual provides a set
of norms for each sex in which larger T scores indicate a
deviation in the direction of the opposite sex pattern.
† For the T-M test, larger positive raw scores mean
greater masculinity; larger negative raw scores mean
greater femininity. In interpreting the percentile scores,
the larger the percentile for men the greater the masculinity;
the larger the percentile score for women, the
greater the masculinity.


T-M. The mean raw scores and standard
deviations for men and women on the T-M are
presented in Table 1. Judged by norms presented
in the manual (18, p. 8), the males are
fairly typical. A mean of 67.4 and standard
deviation of 47.65 reported for 130 college men
on Form A compare well with the obtained
mean of 73.7 and standard deviation of 43.96
of this study.

However, the women in this study prove
considerably more masculine than are similar
groups described in the manual (18, p. 5). The
mean for college women on Form A is -60.[?]
with a standard deviation of 39.15 (18, p. 8).
When both forms are averaged the median
score is -65 (18, p. 13); the standard deviation
which may be estimated from that table is
[illegible]. In this study a mean of -32.6 with a
standard deviation of 42.45 was obtained
(Table 1). Evaluated by the T-M manual,
the women of this investigation score at the
level of M.D. or Ph.D. women.

Because the mean for women cited in the 
manual (18, p. 8) is not comparable with the 
mean of this study, a t test was computed.
The obtained value is 4.08 for 178 degrees of 
freedom. This difference is reliable beyond 
the .01 per cent level. Thus in regard to the 
T-M, the mean obtained for women in this 
study is significantly lower, hence more mascu- 
line than that of the normed population. The 
standard deviations, however, are about the 
same. The F ratio of the two variances is 1.17 
which is not significant. 

Differences between Men and Women on the 
T-M and MMPI. The differences and their 
significances for each test as a whole are shown 
in Table 2. All are significant beyond the .01 
per cent level of confidence. No less signifi- 
cant are differences between the sexes on the 
separate exercises on the T-M.⁴ In each instance
the result justifies the assertion that
men and women score differently on these tests 
of masculinity-femininity—a finding that con- 
forms with the literature (3, 4, 5, 19). 

⁴ In this analysis as well as those that follow, exercises
2 and 7 of the T-M are omitted since there was
almost no variability for either group. Attention is
also called to the fact that these exercises are the least
reliable of the T-M test (19, p. 60).


Table 2


Differences between Men and Women on the T-M and MMPI

i Means 7
a T-M MMPI TM T-M; T-M; T-M; T-M.

wan — 435 +5.18 +30.34 4-40.40 mn
pat =a on 12.16 —4.54 41224 19.00 iy
p eren qe ved 4 821 1972 —— 1840 —— $59.40 Me
D +106.33 T a s.a pan sar
n Es 6.04 8.03 5.29 11.38 a0
P 14.90 —16. i

All values are much less than .001


Correlations. Thus it has been established
that both tests are measuring differences between
the sexes (Table 2). Yet the findings in
the literature concerned with comparisons of
other masculinity-femininity scales as well as
observations made by the authors prior to this
investigation led to the hypothesis that the
T-M and the MMPI, although measuring some
aspects of masculinity-femininity, do not measure
the same aspects to any great extent.
This hypothesis has been verified by the low
correlations of the present study (Table 3).
For men the correlation between T-M and
MMPI is -.30; for women it is -.37. Both
correlations are negative, as should be expected
in correlating raw scores, since a high T-M raw
score means masculinity and a high MMPI
raw score means femininity.

The significance of the departures of the
regressions from linearity was tested by the
technique of analysis of variance (14, 255-62)
for two regressions: the regression of MMPI
scores on the total T-M for the male group, and
the regression of MMPI scores on the total
T-M scores for the female group. In neither
group does the appropriate F ratio approach
significance. For these groups, it can be concluded,
there is no significant departure from
linearity. The other regressions were judged
by inspection to be linear.

The correlations obtained in this study are
comparable in magnitude with those cited in
the literature for other masculinity-femininity
scales. Nance (16) compared the masculinity-femininity
scales of the Guilford-Martin,
Strong and MMPI. The correlations obtained
averaged +.40 for men and +.21 for women.

It may further be assumed that the several
exercises comprising the T-M may themselves
measure somewhat different aspects
of masculinity-femininity. Therefore, an attempt
should be made to determine what
variables each of the separate exercises of the
T-M has in common with the MMPI. This
has been done by correlating exercises 1, 3, 4, 5
and 6 of the T-M with the MMPI (Table 3).


Table 3

Correlations between Tests of M-F; Correlations between MMPI and Subtests of T-M

Group        N     rxy        rx1y†    rx3y     rx4y      rx5y       rx6y
Men          129   -.295***   -.034    -.100    -.214**   -.315***   [illegible]
Women        50    -.365**    +.134    -.305*   -.108     -.213      [illegible]
Difference         -.070      +.168    -.205    +.106     +.102      [illegible]


* Significant at the 5% level of confidence.

** Significant at the 1% level of confidence.

*** Significant at the .1% level of confidence.

† The subscripts refer to the sub-tests of the T-M.
Subtests 1, 3, 4, 5, and 6 of the T-M were correlated with
the total Mf scale of the MMPI.

The confidence limits for 129 cases are: 5% level =
.172; 1% level = .204; .01% level = .285. For 50 cases
the confidence limits are: 5% level = .279; 1% level =
.361.


For women it is exercise three (information)
which correlates most highly with the MMPI.
For the male group exercises four (emotional
and ethical response) and five (interests)
yield the highest correlations. The correlations
thus obtained have also been tested for
significance by applying R. A. Fisher's z transformation
(14, 123-24). The correlations
which attain each level of confidence have been
appropriately indicated in Table 3. Only the
results of exercises 3, 4, and 5 will be of use
in the following analysis since the correlations
for the other exercises do not differ significantly
from zero.
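The significance tests referred to above can be sketched with Fisher's z transformation. This is a minimal illustration, assuming the usual large-sample form in which the transformed correlation has a standard error of 1/sqrt(N - 3) and is referred to the normal curve; the authors' exact procedure and tables are not given, so the probabilities need not match their tabled limits exactly. The function names are illustrative only.

```python
from math import atanh, sqrt
from statistics import NormalDist


def r_significance(r, n):
    """Test whether a correlation differs from zero via Fisher's z.

    Returns (test statistic, two-sided probability under the normal curve).
    """
    stat = atanh(r) * sqrt(n - 3)
    p = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p


def r_difference(r1, n1, r2, n2):
    """Test the difference between two independent correlations via Fisher's z."""
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    stat = (atanh(r1) - atanh(r2)) / se
    p = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p


# Total T-M versus MMPI correlations from Table 3.
_, p_men = r_significance(-0.295, 129)
_, p_women = r_significance(-0.365, 50)
_, p_diff = r_difference(-0.295, 129, -0.365, 50)

print(round(p_men, 4), round(p_women, 4), round(p_diff, 2))
```

Under these assumptions the correlation for men falls beyond the .1% level and that for women beyond the 1% level, in line with the asterisks in Table 3, while the difference between the two correlations is well short of significance.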
The highest correlation for women, obtained
between exercise three (information)
and the MMPI, the authors are at a loss to
explain, since the MMPI did not derive any
items from this exercise of the T-M (Table
4). This correlation, however, is significant
at only the 5% level of confidence and its
difference from zero may have occurred by
chance. For men, the two exercises of the
T-M that correlate most highly with the
MMPI are exercises four and five. As far as
exercise five is concerned, this finding is understandable
since the bulk of the items that were
used in the MMPI were derived from this
[...] and Miles, and others are original.”⁵
(Table 4). The following items may well
illustrate this agreement: (1) T-M, “Are your
feelings often badly hurt?”; MMPI, “My
feelings are not easily hurt”; (2) T-M, “There is
plenty of proof that life continues after death”;
MMPI, “I believe in a life hereafter”; (3) T-M,
“Were you ever fond of playing with snakes?”;
MMPI, “I do not have a great fear of snakes.”
Of the 43 items on the MMPI derived from
the T-M, 25 are from exercise five (interests).
On the basis of “common elements,” therefore,
it is to be expected that the correlations with
exercise five would be highest. Referring to
Table 3 it will be noted that this correlation is
highest for the male group but not for the
female group. However, the magnitudes of
the correlations for either sex are about the
same, both are in the same direction, and they
do not differ significantly when tested.


Table 4


Analysis by Subtests of Items on the T-M
that Occur on the MMPI


                                      No. of Identical
                                      or Similar Items
Exercise on T-M*                      on MMPI**

4. Emotional and Ethical Response      8
5. Interests                          25
6. Personalities and Opinions          2
7. Introvertive Response               8


* Exercises 1-3 inclusive on the T-M have been
omitted from this table since they did not contribute
any items to the MMPI.

** In this category are included either identical items
that occurred in Form A of the T-M or items that were
similar in content but not in format that occurred in
Form B.


Discussion 


It is evident that the T-M and the Mf
scale of the MMPI do not show a high correla-
tion. The correlations are significantly dif-
ferent from zero, however, a result which does
indicate that the two tests are measuring some-
thing in common. On the other hand the cor-
relations are low enough to suggest that the
overlap is not great, a finding substantiated
in a correlational study made by Nance (16)
with other tests of masculinity-femininity.
Much of what the two tests have in common
may be ascribed to exercises four and five.
Although the correlations between the tests
are not high and do not appear to be
measuring the same thing, the highly signifi- 
cant differences between men and women ob- 
tained would indicate that both tests are 
validly measuring some differentiae of the sexes. 
These findings are understandable if mascu- 
linity-femininity comprises not one but many 
dimensions. Accordingly, the tests used in 
this investigation would be measuring some- 
what different dimensions. The possibility of 
several masculinity-femininity factors has al- 
ready been indicated, although not conclu- 
sively, by Guilford and Guilford (7), and 
Martin (13). A factor analysis of the T-M 
test determining the common factors might 
prove more fruitful. Until some such deter- 
mination of what these scales are measuring, 
caution should be used in interpreting these
results in counseling.


Summary 


Fifty female and 129 male undergraduate 
students at the University of Connecticut 
were given both the Terman-Miles M-F Test 
and the Mf scale of the MMPI. Both tests 
clearly differentiated between the sexes but did 
not correlate very highly with each other. 
It is proposed that both tests are validly 
measuring different aspects of masculinity- 
femininity interests and attitudes. A factorial 
analysis of the Terman-Miles M-F test is sug- 
gested. Caution in using the scales in coun- 
seling situations is recommended. 


Received August 18, 1950. 
Early publication. 


References 


1. Benton, A. L. The Minnesota Multiphasic Per-
sonality Inventory in clinical practice. J. nerv.
ment. Dis., 1945, 102, 416-420.

2. Brown, H. S. Similarities and differences in college
populations on the multiphasic. J. appl. Psy-
chol., 1948, 32, 541-549.

3. Burger, F. E., Nemzek, C. L., and Vaughn, C. L.
The relationship of certain factors to scores on
the Terman-Miles Attitude-Interest Analysis
Test. J. soc. Psychol., 1942, 16, 39-50.

4. Disher, D. R. Regional differences in masculinity-
femininity responses. J. soc. Psychol., 1942, 15,
53-61.

5. Gilkinson, H. Masculine temperament and second-
ary sex characteristics: a study of the relationship
between psychological and physical measures of
masculinity. Genet. Psychol. Monogr., 1937, 19,
105-154.

6. Gough, H. G. Diagnostic patterns on the Minne-
sota Multiphasic Personality Inventory. J. clin.
Psychol., 1946, 2, 23-37.

7. Guilford, J. P., and Guilford, R. B. Personality
factors S, E and M, and their measurement.
J. Psychol., 1936, 2, 107-127.

8. Hampton, P. J. The Minnesota Multiphasic In-
ventory as a psychometric tool for diagnosing
personality disorders among college students.
J. soc. Psychol., 1947, 26, 99-108.

9. Hathaway, S. R., and McKinley, J. C. Manual
for the Minnesota Multiphasic Personality Inven-
tory. Minneapolis: Univ. of Minnesota Press,
1943.

10. Heston, J. C. A comparison of four masculinity-
femininity scales. Educ. and Psychol. Meas.,
1948, 8, 375-387.

11. Kelly, E. L., Miles, C. C., and Terman, L. M.
Ability to influence one's score on a typical
pencil and paper test of personality. Character
and Pers., 1936, 4, 206-215.

12. Lough, O. M. Teachers college students and the
Minnesota Multiphasic Personality Inventory.
J. appl. Psychol., 1946, 30, 241-247.




Olga E. de Cillis and William D. Orbison 


13. Martin, H. G. The construction of the Guilford-
Martin Inventory of Factors GAMIN. J. Psy-
chol., 1936, 2, 107-127.

14. McNemar, Q. Psychological statistics. New York:
John Wiley and Sons, Inc., 1949, pp. 364.

15. Meehl, P. E., and Hathaway, S. R. The K factor
as a suppressor variable in the Minnesota Multi-
phasic Personality Inventory. J. appl. Psy-
chol., 1946, 30, 525-564.

16. Nance, R. D. Masculinity-femininity in prospec-
tive teachers. J. educ. Res., 1949, 42, 658-666.

17. Schmidt, H. O. Test profiles as a diagnostic aid:
The Minnesota Multiphasic Personality Inven-
tory. J. appl. Psychol., 1945, 29, 115-131.

18. Terman, L. M., and Miles, C. C. Manual of infor-
mation and directions for use of Attitude-Interest
Analysis Test. New York: McGraw-Hill Book
Company, Inc., 1938.

19. Terman, L. M., and Miles, C. C. Sex and per-
sonality. New York: McGraw-Hill Co., 1936,
pp. 600.

20. Verniaud, W. M. Occupational differences in the
Minnesota Multiphasic Personality Inventory.
J. appl. Psychol., 1946, 30, 604-613.


A Follow-Up Study on Satisfaction with Nursing 


Helen Nahm 
Duke University 


During the Spring of 1947 a study¹ of the
satisfaction of three groups of students was
made at the Duke University School of Nurs-
ing. The groups consisted of 70 seniors (class
of 1947), 62 juniors (class of 1948), and 52 fresh-
men (class of 1949). To measure satisfaction
an adaptation of the Hoppock Job Satisfaction
Blank² was used. To measure factors associ-
ated with satisfaction students were asked to
respond to a number of questionnaire items
designed to obtain information about their re-
actions to the situation in the school of nursing.

Results of the study indicated that the
freshman students, at the end of a period of
nine months in the school, were an enthusiastic
and highly motivated group. Junior and
senior students, on the other hand, were much
less satisfied and showed many evidences of
tension and frustration. Responses of the
latter groups to questionnaire items indicated
a general concern about the lack of time to
give care to patients, as well as to
study, and participate in social and
recreational activities. When asked to give
suggestions for improvement these students
recommended shorter and better-planned hours
of work; the employment of a larger number of
staff nurses so that patients might be given
better care; and a counselor to assist students
with their personal, social, and emotional
problems. They felt that courses should be
made more interesting, and that methods of
evaluating progress of students in their learning
experiences on hospital divisions should be
improved.

Following the publication of
findings of this study a number of changes were
made at the Duke University School of Nurs-
ing. Hours per week of class work and clinical
experience on hospital divisions were reduced
from 4[?] to 44. A full-time counselor and a
number of more highly qualified individuals
were appointed to the faculty. Head nurses
and supervisors were encouraged to take
courses in ward management and teaching, and
personnel work in schools of nursing. A more
diversified program of social and recreational
activities was provided. Every possible effort
was made to create an environment in the
nurses' residence conducive to the welfare
and happiness of students. Faculty members
worked with individual student leaders and
with representative student groups to gain a
better understanding of student needs and
problems, as well as to give active assistance
in planning for and making changes which
seemed indicated.

¹ Nahm, Helen. Satisfaction with nursing. J. appl.
Psychol., 1948, 32, 335-343.

² Hoppock, R. Job satisfaction. New York: Har-
pers, 1935.


Follow-up Study 


To determine whether there is a relationship 
between environmental changes in a school of 
nursing and attitudes of students toward their 
experiences in the school, follow-up studies 
were made of the freshman students who 
participated in the original study (class of 
1949) and also of the group of students who
entered the school of nursing during the fall of 
1947 (class of 1950). The Nursing Satisfac- 
tion Blank was administered to the two groups 
during the Spring of 1948, and again during 
the Spring of 1949. 

Changes in satisfaction of the class of 1949 
from the Spring of 1947 to the Spring of 1949 
are given in Table 1, and changes of the class 
of 1950 from the Spring of 1948 to the Spring of 
1949 in Table 2. Changes in mean scores and 
standard deviations for the two groups are 
given in Table 3. 

For the class of 1949 the correlations between
satisfaction scores for successive years are as
follows: (a) between 1947 and 1948 scores,
r = 0.54; (b) between 1948 and 1949 scores,
r = 0.53; and (c) between 1947 and 1949 scores,
r = 0.41.

The differences between means of 1947 and
1948 scores (t = 5.59) and 1947 and 1949 scores


Table 1

Changes in Satisfaction with Nursing from 1947 to 1949 of Students Enrolled in the Class of 1949

                                       1949
1947  Doesn't like it  Indifferent  Likes it  Enthusiastic  Total  Per Cent
Enthusiastic 5 = an at 
Likes it 4 - ae 
2 2 Em 
Indifferent meg = vn 
Doesn't like it - men 
Total 0 9 25 11 45 100.0 
Per cent 0.0 20.0 55.6 24.4 
Table 2

Changes in Satisfaction with Nursing from 1948 to 1949 of Students Enrolled in the Class of 1950

                                       1949
1948  Doesn't like it  Indifferent  Likes it  Enthusiastic  Total  Per Cent
Enthusiastic 8 11 19 dd 
Likes it 13 5 a E 
Indifferent 1 2 Í Em 
Doesn’t like it mmm 
Total 0 1 23 16 di 100.0 
Per cent 0.0 2.5 60.0 37.5 
Table 3

Changes in Mean Scores and Standard Deviations on
the Nursing Satisfaction Scale of Two Groups
of Students of the Duke University
School of Nursing

               Class of 1949        Class of 1950
                                      (N = 40)
                      S.D. of              S.D. of
               Mean   Dist.         Mean   Dist.
Spring 1947    23.2   1.96
Spring 1948    21.4   2.4           23.2   [illegible]
Spring 1949    21.8   2.44          23.0   1.82
(t = 3.80) are significant at the 1 per cent level
(in the direction of less satisfaction). The
difference between means of 1948 and
1949 scores is not statistically significant
(t = 1.30).

For the class of 1950 the correlation between
1948 and 1949 scores is [value illegible]. The differ-
ence between means is not statistically signifi-
cant. From these findings it appears
that, for the class of 1949, there was a signifi-
cant decrease in satisfaction from the first to
the second year, and no significant increase
from the second to the third year. For the
class of 1950 the slight decrease in satisfaction
from the first to the second year is not statisti-
cally significant.
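
The reported t values for the class of 1949 can be roughly checked from Table 3 and the year-to-year correlations. The sketch below is only an illustration under stated assumptions: the Table 3 entries are read with decimal points restored (23.2, 1.96 and so on), N is taken as 45 from the Table 1 total, and the standard error of a difference between correlated means is used. The article does not state the exact formula the author applied, so small discrepancies are expected.

```python
import math

def correlated_means_t(mean_a, sd_a, mean_b, sd_b, r, n):
    """t for the difference between two correlated (same-group) means.

    The variance of the difference scores is sd_a^2 + sd_b^2 - 2*r*sd_a*sd_b;
    its standard error is the square root of that variance divided by n.
    """
    var_diff = sd_a ** 2 + sd_b ** 2 - 2.0 * r * sd_a * sd_b
    se_diff = math.sqrt(var_diff / n)
    return (mean_a - mean_b) / se_diff

N = 45  # assumption: class of 1949 total shown in Table 1

# Means and S.D.s as read from Table 3 (decimal placement is an assumption).
m47, s47 = 23.2, 1.96
m48, s48 = 21.4, 2.4
m49, s49 = 21.8, 2.44

t_47_48 = correlated_means_t(m47, s47, m48, s48, r=0.54, n=N)
t_47_49 = correlated_means_t(m47, s47, m49, s49, r=0.41, n=N)
t_48_49 = correlated_means_t(m48, s48, m49, s49, r=0.53, n=N)

# Reported values in the text: 5.59, 3.80 and 1.30.
print(round(t_47_48, 2), round(t_47_49, 2), round(abs(t_48_49), 2))
```

Under these assumptions the first two values come out close to the reported 5.59 and 3.80, and the third stays well below the level needed for significance, in line with the conclusion drawn in the text.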


Factors Associated With Satisfaction 


Differences in percentages of students of the
class of 1949 responding to various question-
naire items designed to discover factors associ-
ated with satisfaction in nursing indicated the
changes which had taken place since the time
of admission to the school. Percentage differ-
ences (1947 and 1949) which are significant
at the 1 per cent level indicate that students:

    were more likely to enjoy working with
      doctors and to feel that doctors approved
      of their work;

    more often had opportunities to express their
      ideas on hospital divisions;

    were less likely to enjoy living in the nurses'
      residence; and

    were less likely to enjoy formal classes.

Percentage differences which are significant
at the 5 per cent level indicate that students:

    more often had opportunities to use their
      initiative on hospital divisions;

    were less likely to feel that the teaching pro-
      gram on hospital divisions was adequate;
      and

    were less likely to feel they lacked social
      skills needed to feel at ease in social
      situations.


Attitude Changes 


For the class of 1949 (admitted in September
1946) typical statements which indicate atti-
tude changes which took place after admission
to the school are given as follows:

    I like nursing better, now that I have had
      experience.

    I have a better understanding of the profes-
      sion and its responsibilities.

    I feel more self-assurance when I care for
      sick people.

    I understand patients and others better.

    The thrill has worn off.

    I realize the many things a nurse needs to
      know in order to be proficient.

    I am more realistic and less sentimental
      about nursing.

    I have become more hardened to illness.


For the class of 1950 (admitted in September 
1947) typical statements which indicate atti- 
tude changes are given as follows: 


    I am more interested and enthusiastic than
      during the first year.

    I am less idealistic than when I entered.

    I have developed greater understanding of
      people.

    I derive deep satisfaction from the work.

    I didn't realize it involved so much work and
      study.

    I have matured during the past year.

    I have gained more self-assurance.

    I am better able to accept responsibility.

    I like it, but am not as enthusiastic as when
      I entered.


Eighty-five per cent of the class of 1949 and 
87 per cent of the class of 1950 said that, if it 
were possible to go back a few years, they 
would again enter the Duke University School
of Nursing. The remainder of the students
either were undecided, or said they would 
probably enter a school nearer home. 


Future Plans of Students 


Eighty-one per cent of the class of 1949 said 
they would prefer institutional nursing after 
graduation. Fifty-two per cent said they 
would like to do general staff nursing and the 
remainder either teaching, administration, or 
supervision. Sixty per cent felt they would 
need additional preparation in some field of 
nursing after graduation. Forty per cent 
believed that experience as a general staff 
nurse was all that was needed. 

When asked what they would like to be 
doing ten years from now, 90 per cent of the 
class of 1949 said they would like to be married. 
Only 13 per cent of the group would plan to 
continue in some field of nursing after marriage.


Summary 


The fact that, for the class of 1949, there was 
a significant decrease in satisfaction from the 
first to the second year without a corresponding 
increase from the second to the third year 
would seem to indicate that, once students 
have lost their initial enthusiasm for nursing 
it is not easily regained. On the other hand, 


the fact that, for the class of 1950, there was
no significant decrease in satisfaction with
nursing from the first to the second year would 
indicate that it is possible for students to retain 
their initial enthusiasm for nursing. 
Responses of students of the class of 1950 as 
to their attitude changes following admission 
to the school seem, in general, more favorable 
than those of the class of 1949. They indicate 
that some of the concomitants of a satisfactory 
nursing school program are greater under- 
standing of self and others, deep personal 
satisfaction from the work itself, and a matur- 
ing of the entire personality structure. Re- 
sponses of both groups indicate that students 
become less idealistic about nursing and more 
realistic as they progress through the school. 
This is probably both inevitable and desirable. 
The statements which students make as to 
their attitude changes indicate that only a few 
become actively aware of the great public need 
for the future contribution which they may 
make. The fact that only a small minority 
hope to continue in nursing after marriage 
would tend to support this conclusion. 


From both initial and follow-up studies of
satisfaction of students of the Duke University
School of Nursing the general conclusion may
be drawn that there is an association between
changes in the total environmental situation
and the extent of satisfaction with nursing.
It seems likely, however, that a period of from
two to three years is required to change atti-
tudes which are primarily negative to those
which are predominantly positive. Students
who have been very dissatisfied continue to be
suspicious of the motives of individuals re-
sponsible for a school, even though these indi-
viduals make changes which are much desired.
When the more advanced groups of students
in a school are dissatisfied and unhappy, the
attitudes of less advanced groups are inevitably
affected after a period of time. However, as
the satisfaction of each successive group of
students increases, the morale of the entire
student group undergoes a gradual change.
Suspicion gives way to confidence and trust,
and an atmosphere in which each student may
grow and develop is finally created.


Received December 12, 1949. 


Attitudes of Veterans toward Vocational Guidance Services 


Frederick J. Gaudet, A. Ralph Carli and Leland S. Dennegar 


Stevens Institute of Technology 


The problem of the evaluation of guidance
services is in the forefront of educational and
psychological thinking today. Certainly, at
no time in the past has so much money been
spent on educational and vocational guidance
nor at any time have so many highly trained
psychologists and guidance counselors been
employed in this field.

The attempts to get a picture of the efficacy
of this vast program have been very limited.
There has been only one published study which
used a control group.¹ The Central Office of
the Veterans Administration (VA), however,
has in its files the results of many attitudinal
surveys of veterans who have received voca-
tional guidance in centers under its auspices.
None of them, however, makes use of an ade-
quate sampling.²

A recent study was conducted during this
same period by the Service Center of the Psy-
chological Corporation.³ This study dealt
with a non-veteran population and many of
its questions have been incorporated in the
questionnaire on which the present study is
based. These items were included for the
purpose of comparing veteran with non-
veteran guidance services in an effort to answer
a question posed by many psychologists as to
the efficacy of the VA guidance program.
Doubts have been raised as to the value of a
program which was so huge that its inception
was almost an emergency and for which the
limitation of funds per man implied a certain
restriction on the qualifications of counselors
who could be employed.

The Psychological Corporation had coun-
seled employees in a large industrial
concern, many of whom would be transferred
or released at the termination of hostilities.
An anonymous questionnaire was circulated to
all of these employees asking for their evalua-
tion of, or satisfaction with, the guidance
process. Replies were received from 685 men
and women.

¹ [First author illegible] and Reeves, P. Effects of advisement
upon continuation in training under P.L. 346. Sch. &
Soc., 1948, 67, 429-431.

² Gaudet, F. J. The Veterans Administration Ad-
visement and Guidance Program. Sch. & Soc., 1949,
[volume and pages illegible].

³ [Surname illegible], Rose G. Reported and demonstrated
values of vocational counseling. J. appl. Psychol.,
[year and volume illegible], 460-47[?].

The present study was based on the results 
of a questionnaire sent to 200 veterans who 
had received counseling at the VA Guidance 
Center at Stevens Institute of Technology. 
The survey was designed to get a picture of the 
attitudes of veterans toward the guidance pro- 
gram as a whole, and also to get opinions on 
specific phases of the procedure with a view 
toward improving the services. 

The subjects were selected at random from 
the files of those who had received guidance 
between September 29, 1945 and February 16, 
1946. All of these veterans had visited the 
center at least one year prior to the time the 
survey was started. The questionnaire, along 
with a letter explaining the purpose of the 
study, was mailed to these 200 men, of whom 
81 answered. Those who did not answer the 
first letter were sent a second which produced 
an additional 51 responses, and a member of 
the staff interviewed those who did not answer 
either letter. Altogether 164 replies were 
secured. The remaining 36 did not answer the 
questionnaire for the following reasons: (1) 
Address changed, no forwarding address, 23; 
(2) Living outside of New Jersey, 5; (3) Re- 
enlisted, 2; (4) In VA hospital, 2; (5) Non-ex- 
istent address, 1; (6) Died, 1; (7) Could not re- 
call his impressions, 1; and (8) Refused to 
answer, 1. 
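
The bookkeeping in the paragraph above can be checked with a few lines of arithmetic. This is only a tally of the figures given in the text; the number reached by staff interview (32) is not stated by the authors and is inferred here as the difference between the 164 total replies and the two mailings.

```python
# Figures taken from the text.
mailed = 200
replies_first_letter = 81
replies_second_letter = 51
total_replies = 164

# Inferred, not stated in the article: replies obtained by staff interview.
replies_by_interview = total_replies - replies_first_letter - replies_second_letter

# Reasons given for the remaining non-replies, as listed in the text.
non_reply_reasons = {
    "address changed, no forwarding address": 23,
    "living outside of New Jersey": 5,
    "re-enlisted": 2,
    "in VA hospital": 2,
    "non-existent address": 1,
    "died": 1,
    "could not recall his impressions": 1,
    "refused to answer": 1,
}
non_replies = sum(non_reply_reasons.values())

return_rate = 100.0 * total_replies / mailed

print(replies_by_interview, non_replies, return_rate)
```

The reasons account for all 36 non-replies, the replies and non-replies together make up the 200 questionnaires mailed, and the return works out to the 82 per cent cited later in the article.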

With the exception of the man who refused 
to answer, and perhaps the one whose recol- 
lection was poor, it is apparent that probably 
no selective factor operated to distort the 
answers. 

The following questionnaire was used: 


Authorization No. ........
(Draw a circle around the word which is your
answer to each of the questions on this page)

1.* As a result of your visit to this center did
    you get a better idea
    a. of your strongest
       abilities? ........ yes no doubtful
    b. of your less strong
       abilities? ........ yes no doubtful

2.* [Item stem and part a are illegible in this copy.]
    b. in general? ....... yes no doubtful

3.* Did the guidance and counseling you re-
    ceived here
    a. increase your self-
       confidence? ....... yes no doubtful
    b. decrease your self-
       confidence? ....... yes no doubtful

4.* Did the guidance and counseling on the
    whole give you a better understanding of
    yourself? ............ yes no doubtful

5.* Do you feel that your guidance and coun-
    seling was a worthwhile experience?
                           yes no doubtful

6.* Would you recommend a similar counsel-
    ing center for non-veterans at their own
    expense? ............. yes no doubtful

7. On which floor were you treated best?
                           1st 2nd 3rd

8. On which floor were you treated least
    satisfactorily? ....... 1st 2nd 3rd

9. What part of our job do you think we do
    best?

10. What part of our job do you think we do
    poorest?

11. We improve our work through criticism
    and suggestions. What can you suggest
    that we can do to change or improve our
    services?

12.* At what age do you think guidance and
    counseling should be provided?


The first five items were designed to deter- 
mine the effect of counseling on the veteran’s 
view of himself. Item 6 was directed toward 
obtaining a general reaction to the counseling 
service. As a basis for self-criticism and an 
aid in improving Stevens service, items 7 
through 10 provided an opportunity for evalua- 
tion of specific aspects of the program; con- 
crete suggestions for improvement were re- 
quested in item 11. Item 12 solicited the 
veterans’ opinions regarding the persistent 

question “When should one receive guidance?”
Of course, the correct answer is continuously,
but until our schools and colleges are able to
furnish educational and vocational guidance,
this is merely a theoretical answer.

The median age of the group was 25 years.
Unfortunately, data on educational levels were
not available for these men but a study of a
similar sampling of 200 men who received coun-
seling during the same period showed that
the median of the highest grade completed was
11, with lower and upper quartiles of [values
illegible], respectively.

Although responses were received from 164
veterans, not all answered every question.
This failure to answer was particularly evident
in b of items 1, 2, and 3, and in items 9 and
10, in which the veteran thought he had an-
swered the query in previous questions, or parts
of questions, or presumably had nothing derog-
atory to say.

* Items marked with an asterisk were taken from the
Psychological Corporation questionnaire. Some were
modified slightly.
The responses to the first six items of the
questionnaire are indicated in Table 1. In
general, the results of the Psychological Cor-
poration and Stevens studies are fairly com-
parable. It should be remembered in inter-
preting these data that the Psychological Cor-
poration survey was based on a 5[?] per cent
return and that no study was made of those
who did not answer. The Stevens study was
based on a return of 82 per cent, and a follow-up
which indicated that those who did not respond
were probably not a selected group.

The results for part a of item 1 indicate that
the two centers were similar in acquainting the
individual with his assets, but part b shows
that Stevens did not emphasize a
man's handicaps or liabilities as much as did the
Psychological Corporation. Whether this em-
phasis on “less strong abilities” is desirable or
not is a question of the philosophy of guidance,
namely, whether guidance should give the
client a picture of both his assets and his
liabilities, or whether it should stress his assets.
Of course, both should be evaluated by the
counselor, and the client should be told of his
liabilities if he mentions educational or voca-
tional objectives in which he would be handi-
capped.

The responses to item 2b indicate that less
than one-third (30%) had under-estimated
their abilities in general. However, it is inter-
esting to observe that considerably more (44%)
had under-estimated their aptitudes and abili-
ties for particular jobs. Again, the responses
in the studies of the Psychological Corporation
and Stevens seem to agree.

Probably as important as obtaining a clear
picture of his aptitudes and abilities, is the
effect of guidance on the individual's confidence
in himself. The answers to item 3a indicate
that 73 per cent of the veterans increased their
self-confidence as a result of going through
guidance, while only four per cent stated that
their general self-confidence was lowered. It
is probable that a good guidance procedure
would increase the self-confidence of most indi-
viduals but decrease that of a few whose qualifi-
cations for particular jobs are questionable.
Whether the “right” ones had their self-confi-
dence increased or decreased, these data do not
tell us. The differences between the two
groups, veteran and non-veteran, are not
significant.

The answers to item 4 were intended to give
an over-all picture of what the guidance process
does for an individual in helping him see himself
realistically. The results of the questionnaires
of the two centers are again similar; about
three-quarters felt they had a better under-
standing of themselves.

Item 5 is probably the key question in the
questionnaire. The objective of an
educational and vocational guidance center is
not merely to give the individual “a better
understanding of himself” (item 4), but to give
him an understanding of himself in relation to
his educational and vocational future. This
aim is probably reflected in the con-
siderably higher percentage of (95 per cent)
“yes” answers to item 5 than to those (74 per
cent) to item 4. The percentages of “yes”
answers on the two questionnaires, Stevens
and Psychological Corporation, are almost
identical.

Item 6 constitutes another method, a less
direct one, of getting an over-all evaluation of
what the veterans thought of the guidance
program. It is possible that someone who had
been through guidance would consider it de-
sirable for himself but would not think it
necessary for others. In both
the Psychological Corporation and Stevens
studies the services were free. The question
asked was whether they would recommend
it for others who would have to pay. It will
be observed that in both studies there is a
decrease in the percentage of “yes” answers
as compared to item 5, but the difference is not
large.

In comparing the two studies it should be 
noted that the Psychological Corporation 
study was based on a population who went 
through the guidance process voluntarily. 
The Stevens study included some veterans (22)
who came to the guidance center because they
asked for this service under Public Law (PL)
346. The majority, however, were PL 16 
cases, disabled veterans (142) who were com- 
pelled to go to a VA center before they were 
permitted to take training under the VA. Of 
course, it is not implied that all disabled vet- 
erans came under compulsion. In fact, their 
disabilities may have caused them to want 
guidance more than the average individual. 

Items 7 and 8 were used in the questionnaire 
in an attempt to “fractionate” the favorable or 
unfavorable attitudes shown by the veterans. 
The first floor of the Stevens Guidance Center 
building was staffed by VA personnel; coun- 
seling took place on the second and third 
floors, and testing was done on the third floor. 
One interesting feature of the replies is that 
although the questionnaire did not offer an 
opportunity to answer "all" to either of these 
questions, 25 per cent wrote in this answer to 
item 7, and only two per cent to item 8. Dis- 
content was more frequently located on the 


first floor as indicated in the low (5 per cent)
percentage of “yes” answers to this floor in re-
sponse to item 7 and the high (28 per cent)
percentage of “yes” to item 8. To determine
whether the least favorable attitude toward
the first floor in response to items 7 and 8 had
influenced the feeling toward the whole guid-
ance procedure, answers to items 7 and 8 were
correlated with items 4 and 5. The results in-
dicated that there was no inter-relationship.*

Table 1

Reaction of Veterans to Vocational Guidance

                 Per Cent Answering:
Item    Yes*       No    Doubtful      No Reply
1a      76 (76)    13       9             2
1b      43 (69)     9      18            33
2a      44 (51)    34      13             9
2b      30 (27)    35    [illegible]     23
3a      73 (80)    12      13             2
3b       4 (6)     60       9            27
4       74 (80)    13      13             1
5       91 (90)     4       5             1
6       84 (71)     7       8             1

* Percentages in parentheses are those obtained by
the Psychological Corporation in its questionnaire.

Items 9 and 10 were included in the ques- 
tionnaire to evaluate various parts of the 
guidance process. It is interesting that the 
most frequent answer to both questions was 
“all good"—38 per cent to item 9 and 25 per 
cent to item 10. The next most frequent 
favorable answer to item 9 (29 per cent) and 
the most frequent unfavorable answer to item 
10 (14 per cent) was counseling. Since coun- 
seling might be considered the climax, or the 
part of the guidance program requiring the 
greatest professional skill, the frequency of 
these answers is noteworthy in that they were 
able to recognize its importance to them. The 
other answers to item 9 in order of frequency 
were: testing, “no reply,” interviewing, amount 
of time devoted to the guidance process, and 
VA representatives. The other dissatisfac- 
tions expressed in answers to item 10 in order 
of frequency were: “no reply,” amount of time 
devoted to the counseling, testing, VA repre-
sentatives, lack of placement aids, “nothing
good,” and location of the center.
Cues as to the above favorable and un- 
favorable evaluations are indicated by the re- 
sponses to item 11. A tabulation of these re- 
sponses is presented in Table 2, Certainly 
these responses give the impression that the 
respondents had taken the questionnaire 
seriously. No comments appeared to be 
facetious. The chief value of the comments 
has been to make the counselors more keenly 
aware of their need to learn what the counselee 
seeks from guidance. Analysis of the com-
ments according to the counselor involved did
not indicate that the suggestions referred to
one more than another.

The answers to item 12 covered a wide
range. The median age for which counseling
was considered desirable was 18 years.

* A later study of a new group (200) of veterans who
went through the guidance center when it was staffed
with different VA personnel indicated that the answers
to items 7 and 8 were not due to VA procedures but to
the personnel carrying out these procedures. The
answers to these items for “first floor” were 23 per cent
and six per cent, respectively, in the second study.

Table 2

Veterans' Suggestions for Improving the
Guidance Process

                                                          N
 1. Increase length of process: more counseling (11),
    more tests (7), more school information (4), more
    job information (3), more information on veter-
    ans' benefits (1) .................................. 26
 2. Decrease length of process: shorter program (17),
    fewer tests (7) .................................... 24
 3. Do placement work: find employment (2), follow
    up to see that veterans get jobs (8) ............... 10
 4. Improve scheduling* ................................ 5
 5. Prove validity of tests ............................ 2
 6. Improve introduction to guidance ................... [illegible]
 7. Improve personal interest in counselee:
    more interest (3), less interest ([illegible]) ..... 3
 8. Improve personnel: generally (2), no women
    ([illegible]) ...................................... 5
 9. Improve everything ................................. 2
10. Improve location ................................... 1
11. Miscellaneous† ..................................... [illegible]

* At the time these veterans came to the center,
veterans frequently had to wait five or six weeks be-
tween asking for guidance and the first appointment.

† Each of these was mentioned by only one veteran.
They ranged from “advertise services offered” to “give
veterans a letter of acceptance to school suggested.”


Summary 


The present study was undertaken to determine what
attitudes toward the guidance process were held by 164
veterans who had received counseling at Stevens Institute
[illegible] and 1946. In comparison with the data obtained
from a non-veteran group [illegible]. The Stevens study
suggests that, in spite of limitations in time and
[illegible], veterans believed that they had profited from
the VA program of educational and vocational guidance.
Received July 24, 1950. 

Early publication. 


Upper versus Lower Case Copy as a Factor in Typesetting Speed 
for Linotype Trainees * 


Bernard Stern 


State University of Iowa 


The steadily increasing cost of labor in the
newspaper industry has resulted in a continuing
search for means and methods to bolster the
productivity of newspaper employes. Especially
is this true of mechanical employes. These
workers, who belong to long established unions,
are frequently paid more highly than editorial
workers.

Thus, this writer was interested in noting in a book,
How to make type readable, the following statement:
“Material set in all capitals is read [illegible] per cent
more slowly than material set in lower case. Reader
preferences are overwhelmingly in favor of lower case.”

Together with that claim by Paterson and Tinker was this
statement: “It is apparent that all-capital text retards
speed of reading to a striking degree. . . . Few
typographical factors can be found which will retard
reading to this extent.”

If these claims held true for newspaper linotype
operators as well as for readers, it occurred to this
writer that here lay a fertile means of boosting
productivity. Such workers daily read and set in type
millions of ems from wire copy which comes printed in
all-capitals out of teletype machines in newsrooms
throughout the country. In addition to this, these
operators also set approximately as much type from
reporters' typewritten copy which is usually set in lower
case. The question, stating it more succinctly, is, in the
light of Paterson and Tinker's finding, can linotype
operators set more composition in a comparable amount of
time from wire copy (upper case) or from typewritten copy
(lower case)?

* The writer would like to express his appreciation to
Professor Clayton d’A. Gerken, Psychology Department, and
to Professor Leslie G. Moeller, Marshall N. Hym[illegible],
[illegible] J. Morrison and Henry Africa, School of
Journalism, State University of Iowa.
1 Paterson, D. G. and Tinker, M. A. How to make type
readable. New York: Harper & Brothers, 1940. [illegible]
obtainable from the authors, University of [illegible],
p. 23.

If linotype operators could set an appreciably 
greater amount of copy from lower case copy 
rather than from upper case copy in a compara- 
ble amount of time, perhaps it would behoove 
the news wire services (Associated Press, 
United Press, International News Service) to 
make an adjustment in their teletype machines 
so that news stories could arrive in newspaper 
offices printed in lower case (caps and lower 
case). Or if this were not possible, it seemed 
likely that publishers could make a savings by 
employing typists to rewrite the all-capital 
wire copy into lower case. 

On the other hand, if linotype operators 
could set more composition from upper case 
wire copy, would it not be worthwhile to have 
reporters turn in their stories typed in all- 
capitals? 

An ideal means of testing this proposition 
with linotype operator trainees was available 
to the writer in the Newspaper Production 
Laboratory of the University's School of 
Journalism. Each semester the laboratory 
trains 12 to 15 student linotype operators. 

It should be emphasized here that the oper- 
ators participating in this experiment were 
beginners. At the time this study was con- 
ducted they were capable of turning out an 
average of one-half galley of composition an 
hour compared to experienced newspaper 
linotype operators who can set more than twice 
this amount per hour. 

The findings set forth in this study, therefore,
pertain solely to beginning linotype operators.
No attempt is made to project the results to
the work of experienced operators.


Procedure 


This experiment was carried out after the student
linotype operators had spent about twelve weeks in the
Newspaper Production Laboratory learning how to operate
the linotype. Twelve students were used. This group was
divided in two equal sections.

For this study, the author collected 130 news stories
which had come over the Associated Press teletype in The
Daily Iowan, student-published daily newspaper. These
stories were cut into page lengths of 8½ inches by 11
inches and were retyped in lower case. Every effort was
made to get the same number of lines per page and the
same number of words per line in the typewritten version
as was in the wire (all-capital) story.

The 130 stories contained approximately 25,000 words and
took up 110 pages. The various news stories varied in
length from 60 words to 800 words and included the
following varieties of news: international, national,
crime and accidents, financial, reports of speeches and
meetings, weather and sports. A breakdown in the sports
category showed it included stories about baseball, golf,
tennis, trapshooting, swimming and horse racing. Box
scores and summaries of sports results usually set in
agate type were omitted from the copy set by the linotype
trainees. Each story was given a "slug" or tag to
identify it.

Before the operators began work they were cautioned to
set the stories exactly as they appeared on the copy. In
the wire or all-capital version of the story the capital
letter was underlined so the linotype operator would set
that letter in a capital. Otherwise there was no
difference between the two types of material.

Each of the twelve operators was given the opportunity to
set the same stories from the two types of copy, upper
case and lower case. If during the first week he worked
on the all-capitals copy, the second week he set the same
stories which had been retyped in lower case. If the
operator had worked on lower case copy the first week,
the second week he was given the same stories to set in
the upper case version.

A record was kept of the number of lines set daily by
each operator from upper case and from lower case copy.
At the end of each week these scores were totaled. The
“t” test for the significance of the difference between
the two groups of scores was then made.

3 The “t” test employed was that for related measures.
[Reference illegible; publisher Harper & Brothers.]


Table 1
Operator’s Weekly Average Speed, Lines per Minute and Weekly Average Errors, Errors per Line

                 Weekly Average Speed       Weekly Average Errors
Operator's       Upper Case   Lower Case    Upper Case   Lower Case
Number           Copy         Copy          Copy         Copy
(1)              (2)          (3)           (4)          (5)
 2               1.50         1.29          [?]          .045
 3               1.28         1.74          [?]          .021
 4               1.50         1.44          [?]          .043
 5               1.13         1.20          [?]          .034
 6               2.10         1.76          [?]          .062
 7               1.25         1.54          [?]          .024
 8               1.35         1.57          [?]          .040
 9               1.32         1.40          [?]          .190
10               1.48         1.47          [?]          .058
11               1.44         1.35          [?]          .059
[?]              1.25         1.53          [?]          .038
14               2.16         1.97          [?]          .054

Mean             1.48         1.52          .055         .056
S.D.                   [?]                        [?]
Mean Difference        .04                        [?]
S.E. Difference        [?]                        [?]
t                      .58                        [?]




Three linotype machines were available in the Newspaper
Production Laboratory for the experiment. So that no one
operator should have the advantage of working on a
machine which might be easier to operate, the operators
were rotated on the machines.

It was believed that as a corollary to determining
whether linotype operators could set more composition
from lower case or from upper case copy, it would be
important to discover if the operators made more errors
in setting upper case material than in setting lower
case copy.

Before beginning the experiment, a pilot test was run.
This was done with a different group of twelve linotype
operators to see whether or not it would be possible to
control the necessary factors. Another purpose was to
find out if any "bugs" would arise to hamper the project.

The results of the pilot test were satisfactory and
proved to be similar to those obtained in the experiment
itself.


Results 


Table 1, columns 2 and 3, shows the mean speed scores
made by each of the twelve operators on the two kinds of
copy. The mean speed of the 12 operators on the upper
case copy was 1.48 lines per minute, while on lower case
copy, the mean speed was 1.52 lines per minute. The range
of operators’ performances varied from 1.13 to 2.16 lines
per minute on upper case copy, and from 1.20 to 1.97
lines per minute on lower case copy.

Table 1, columns 4 and 5, shows the mean error scores
made by each of the twelve operators on upper case and
lower case copy. The mean error score of all operators on
upper case copy was .055 errors per line compared to .056
errors per line on lower case copy. The range ran from
[illegible] to .071 errors per line on upper case and
from .021 to .190 on lower case copy.

At the bottom of Table 1, the mean difference, standard
deviation, standard error of difference and the “t” test
values are shown.

It was decided that a difference would not be considered
significant unless the difference was significant at the
five percent level. To be significant at the five percent
level the “t” test value would have had to be 2.20.




No significant difference was shown in set- 
ting from upper case copy as compared to 
lower case copy. This was true for both speed 
and errors. 
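
The test for related measures can be retraced from the speed columns of Table 1. The Python sketch below is an illustration only, not the author's own computation. It assumes the twelve pairs of speeds as they can be read from the table, with decimal points restored where the scan dropped them, and it uses the usual paired-difference formula with n - 1 degrees of freedom. The variable names are hypothetical.

```python
import math
import statistics

# Weekly average speeds (lines per minute), read from Table 1.
# Assumption: decimal points restored where the scan dropped them.
upper = [1.50, 1.28, 1.50, 1.13, 2.10, 1.25, 1.35, 1.32, 1.48, 1.44, 1.25, 2.16]
lower = [1.29, 1.74, 1.44, 1.20, 1.76, 1.54, 1.57, 1.40, 1.47, 1.35, 1.53, 1.97]

# Paired differences: lower case minus upper case, one per operator.
diffs = [lo - up for lo, up in zip(lower, upper)]
n = len(diffs)

mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)    # sample S.D. (n - 1 in the denominator)
se_diff = sd_diff / math.sqrt(n)     # standard error of the mean difference
t_value = mean_diff / se_diff

CRITICAL_T = 2.20                    # five percent level, as stated in the text

print(f"mean difference = {mean_diff:.3f}")
print(f"t = {t_value:.2f}, significant = {abs(t_value) >= CRITICAL_T}")
```

Run as written, the sketch gives a mean difference of about .04 and a t of roughly .6, well short of the 2.20 required, which agrees with the finding of no significant difference in speed.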

Conclusions drawn from this experiment are 
not presented as definitive proof that the 
findings apply to all operators. The students 
at the time they began this experiment had 
had twelve weeks of training during which 
period they had practiced an average of ten 
hours a week on the linotype. Only to this 
extent, can the findings of the experiment be 
applied generally and then they can be gen- 
eralized for trainees only. 

Although no attempt is made to project
these findings, it may be helpful to know,
simply as a frame of reference, what is expected
of an experienced newspaper linotype operator.
Three persons in the field with long experience,
when consulted, agreed that an experienced
operator could set between 1,600 and 1,700
lines of type a day and not make more than
six errors per galley.

The speed of composition of an operator 
setting 1,600 lines would be 3.3 lines per minute. 
Figuring that there are 170 lines of type to a 
galley an operator making six errors per 
galley would have an error of .035. 

The average speed of all the linotype 
trainees on both types of copy (upper case and 
lower case) was 1.50 lines per minute. The 
average errors made by the twelve trainees on 
both types of copy was .0555. Thus it appears 
that the linotype operators participating in 
this experiment were, as a group, slightly less 
than half as fast and made nearly twice as 
many errors as an experienced newspaper 
linotype operator. 
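
The comparison above rests on a few lines of arithmetic, sketched here in Python. The 480-minute (eight-hour) working day is an assumption made only to reproduce the figure of 3.3 lines per minute; the article does not state the length of the day. All other figures are taken from the text, and the names are hypothetical.

```python
# Figures quoted in the text for an experienced operator.
LINES_PER_DAY = 1600
ERRORS_PER_GALLEY = 6
LINES_PER_GALLEY = 170

# Assumption (not stated in the article): an eight-hour working day.
MINUTES_PER_DAY = 8 * 60

expert_speed = LINES_PER_DAY / MINUTES_PER_DAY            # lines per minute
expert_error_rate = ERRORS_PER_GALLEY / LINES_PER_GALLEY  # errors per line

# Trainee averages over both kinds of copy, as reported.
TRAINEE_SPEED = 1.50
TRAINEE_ERROR_RATE = 0.0555

speed_ratio = TRAINEE_SPEED / expert_speed
error_ratio = TRAINEE_ERROR_RATE / expert_error_rate

print(f"expert speed      = {expert_speed:.1f} lines per minute")
print(f"expert error rate = {expert_error_rate:.3f} errors per line")
print(f"trainee/expert speed ratio = {speed_ratio:.2f}")
print(f"trainee/expert error ratio = {error_ratio:.2f}")
```

On these assumptions the trainees work at a little under half the expert speed, and their error rate is somewhat more than one and a half times the expert rate.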

In concluding, it may be well to mention that linotype
operators, by the very nature of their job, must read
carefully every letter and symbol of the material they
set in type. As a result their reading rate is slowed
considerably since their job motivates them to stress
accuracy in reading copy, rather than speed or
comprehension. It is possible that this may account in
large part for the lack of significant difference between
the setting of the two different types of copy.
Conversations with several linotype operators indicated
that in setting ordinary material, they do not read it
letter by letter unless such words are foreign to them,
or such words are technical, complicated or names or the
spelling is different than that to which they are
accustomed.

4 The figure 170 lines to the galley is based on the
assumption the operator is setting 8-point type on
8½-point slug and the slug is 12 ems wide.

As was stated previously, in setting copy ex- 
perienced operators are required to set a 
certain amount of lines per day with a certain 
minimum of errors. Thus, in setting type 
there is a premium on both speed and accuracy. 
It may be added that it costs approximately 
fifteen cents per line to make a correction. 

After the experiment was concluded, the 
participants were interviewed individually in 
an effort to discover which of the two types of 
copy they preferred, upper case or lower case. 




Eight of the twelve unhesitatingly said they
preferred lower case copy. They claimed it
was easier to read and marking individual
letters for capitalization sometimes was confusing.
Two of the twelve were just as unhesitating
in stating a preference for upper case.
They said it was easier to read because the
large cap letter was more easily perceived.
The remaining two said they had no preference.

In matching the individual’s record against 
his performance, it was ascertained that the 
attitude of the operator had little influence on 
what he did. In practically all of the cases 
the operators sometimes would set the upper 


case copy faster one day and the lower case 
the next. 


Received November 14, 1949.


Design Complexity as a Determiner of Visual Attention Among 
Artists and Non-Artists * 


Walter A. Woods and 


James C. Boudreau 


Pratt Institute, Brooklyn, New York 


Observations by artists and art critics are frequently
concerned with the need to train the artist to “see.”
Zadkin (8) speaks of teaching the pupil to see. Ensor (3)
distinguishes between the practiced eye and the “more
common eye.” Roger Fry (4) comments: “Now this
specialization of vision goes so far that ordinary people
have almost no idea of what things really look like . . .
the moment an artist who has looked at nature brings to
them a clear report of something definitely seen by him,
they are wildly indignant at its untruth to nature.”

These observations are indicative of a rather widespread
belief that the artist sees “better” or in a manner
different from the non-artist. Preliminary attempts to
investigate the visual responses of artists and
non-artists are briefly mentioned by Brandt (2) and
Buswell (3). As yet little evidence has come to light
which has a bearing on or which enables us to arrive at
any general understanding of the problem. The present
experiment was designed to determine whether it was
possible to discover differences in the "seeing"
processes of artists and non-artists, and whether in
fact, differences exist. This report, which should be
regarded as preliminary, indicates that it is possible to
measure sensory differences in visual attention and that
artists do differ, as a group, from non-artists.


Selection of a Visual Stimulus 


The determination of the stimulus elements to be utilized
stems from the hypothesis that artists tend to respond
differently from non-artists to designs of varying
complexity; that the artist will be more sensitive and
the non-artist less sensitive to complex designs. This
hypothesis is derived theoretically from the established
design concept of unity in variety; that is, a satisfying
work of art must at the same time be sufficiently unified
so that attention will be centered on the design as a
whole, yet sufficiently diverse so that interest is
maintained. (See Graves (5) for a discussion of design
principles.) Presumably the artist who is able to
successfully execute a design which meets these criteria
has command of a visual language of higher complexity
than the non-artist who is unable to execute satisfying
designs.

* The authors are indebted to Walter Civardi for
photographic consultation and Eugen H. Petersen for aid
in selecting and executing the designs.

That artists differ from non-artists in their 
grasp of complexities of design units (and 
among themselves as artists) is suggested by 
the statement attributed to Picasso (6): 
“Cubism is no different from any other school 
of painting. The same principle and the same 
elements are common to all. The fact that 
for a long time cubism has not been understood 
and that even today there are people who can- 
not see anything in it means nothing. I do 
not read English, an English book is a blank 
book to me. This does not mean that the 
English language does not exist. . . .” 

This statement illustrates one phase of our 
problem and at the same time raises another; 
what are the elements of design which are 
representative of "varying complexity"? Is 
color an element? Or form or space or subject
matter? We are aware of the frequent 
criticism leveled at abstract art that it is 
“confusing” and meaningless; or the less fre- 
quent, but important criticism that the colors 
dominate the design and thus the meaning must 
be found in the "emotional" stimulus of the 
colors. It is generally understood that ab- 
stract art has some meaning (frequently be- 
lieved to be a hidden meaning) and that this 
meaning (or the understanding of it) may 
depend upon the visual sophistication of the 
observer. According to some critics (and the 
above comment of Picasso bears this out) 
abstract designs do have symbolic meaning 
apart from their purely sensory qualities, 




Thus, an abstract design, as such, is not an adequate
stimulus for our study since the grasp of symbolic
content might enhance the attention value for the
artistically sophisticated, regardless of sensory
qualities.

The same criticism may be directed to color as a
criterion. Colors are generally considered to have
symbolic meaning (see Birren [1]) in addition to their
stimulus value. Past experiences and associations may
determine the attention value of any particular color or
combination of colors, apart from such attention
determining qualities of sensory appeal such as chromatic
strength, reversal, dramatic quality.

Representational material (pictures of things) 
is equally unsatisfactory since the primary de- 
terminer of attention may be the symbols in 
themselves, apart from simplicity or complexity 
of the design. For example, a religious paint- 
ing might very well attract strong attention 
from artist and non-artist alike if both were 
equally religious while the non-religious artist
as well as the non-religious layman might
reject the painting because it lacked satisfactory
design qualities.

It became apparent that we were required to
select, for our stimulus, designs which were
neither representational nor abstract and which
were monochromatic, designs which in fact were
non-objective and which were black and white.


The Experiment 


A well established technique for the measurement of
differences in visual sensitivity (differences, actually,
in eye movements) is that of Brandt (2). His report on
the use of the bidimensional camera suggested that this
device and technique might be suitable for our
experiment. It was further decided to utilize Brandt’s
plan of dividing the area into four stimulus areas.
Design Charts A and B were executed in accordance with
this plan, and with the above limiting qualifications:
black and white and non-objective. The original designs
were laid out on an area nine by fifteen inches, in the
same proportions as shown on the accompanying charts.
These designs were planned, as is apparent in Figure 1,
so that they proceed in complexity from a single square
to a cube with the interior exposed, and from a single
broken line to three broken lines interrupted by
diagonals.

Fig. 1. Designs used in the experiment.

Preliminary experiments were conducted to determine
whether placement of the design units or areas would
influence response to the areas. Designs were rotated,
following Brandt (2), in each of the four quadrants. It
was apparent that position was not of significant
influence insofar as individuals were concerned in the
small sample used (eight individuals). So, in view of
economy considerations (photographing and transcribing
costs were prohibitive), it was decided to eliminate
position as a control for the balance of the experiment.
The following groups of twelve subjects each were
selected for the experiment: For design chart A: (1)
Third year advertising design students; (2) First year
(Foundation) art students; (3) Secondary school students
age 15-17 taking art courses on Saturday; (4) Elementary
school students age 12-14 taking art classes on Saturday;
(5) Third year Electrical Engineering students selected
for proficiency in mathematics; and (6) Third year home
economics (foods) students. For design chart B, the
following groups of twelve students were used: (1) Third
year advertising design students; (2) First year
(Foundation) art students; (3) Third year Electrical
Engineering students selected for proficiency in
mathematics; and (4) Secondary school students age 15-17
taking art classes on Saturday.

Although each group originally consisted of twelve
students (except for the Foundation art group which
consisted of thirty students) the final number of
completed records was in all instances less than twelve.
This situation came about because of faulty recording for
the most part due to the use of release positive film.
Actual numbers are indicated for each group in each of
the following tables.

Employing the technique described by Brandt, the charts
were placed in position on the rack of the camera and the
subjects (observers) were asked to view the designs.
Chart A was submitted first, to six groups; Chart B to
four groups as noted above. The original intention was to
use only Chart A, but after the experiment had commenced,
it was decided to use Chart B as well as Chart A. The
observer was allowed to look at the chart for nine
seconds and then was instructed to close his eyes. Chart
B was then placed in the rack and the observer instructed
to open his eyes. He was allowed eight seconds to observe
Chart B. Observers were given no instructions as to what
to look for or how to observe the designs. They were
simply instructed: “You will be shown some designs. Open
your eyes when told to do so and close them when told to
do so.”
Responses of the observers were analyzed in terms of the
following aspects of visual response: (1) Mean time spent
by each group in viewing each area of each chart; (2) Per
cent of total time spent in viewing each area of each
chart by each group; (3) Per cent of initial fixations
made in each area by each group; (4) Per cent of total
eye fixations made in each area by each group; and (5)
Average number of fixations made by each group in each
area. Analysis of variance was performed to determine
variance ratio in mean time spent by each group in
viewing each area, and to determine level of significance
of these means.

Results of the experiment are shown in the following
tables. It is indicated (Tables 1 and 4) that art
students tend to spend more time in viewing the complex
areas and less time in

Table 1

Per Cent of Time Spent in Viewing Each of Four Areas
of Design Chart A by Members of the Following Student
Groups: (1) Third Year Home Economics (Foods);
(2) Students Age 12-14 Taking Saturday Art Classes;
(3) Third Year Electrical Engineering Students; (4)
Students Age 15-17 Taking Saturday Art Classes;
(5) Students in Advertising Design; and (6) Foundation
(First Year) Art Students

                              Per Cent of Time in Each Area
                                Area   Area   Area   Area
Group                      N     1      2      3      4
Home Economics             6     20     24     26     30
Saturday Art Age 12-14     9     15     17     31     37
Electrical Engineering     8     20     22     22     35
Saturday Art Age 15-17    10     20     20     20     39
Advertising Design         9     12     20     22     46
Foundation Art            20     14     15     21     50


viewing the simple areas of the designs, whereas 
non-art students tend to distribute their time 
more evenly over the four areas. From Table 
3 it will be seen that the between group vari- 
ance exceeds the within group variance in 


Table 2

Mean Time (in Seconds) Spent in Viewing Each of Four
Areas of Design Chart A by Members of the Following
Student Groups: (1) Third Year Home Economics
(Foods); (2) Students Age 12-14 Taking Saturday Art
Classes; (3) Third Year Electrical Engineering; (4)
Students Age 15-17 Taking Saturday Art Classes; (5)
Advertising Design; and (6) Foundation (First Year)
Art Students

                                Mean Time in Each Area
                                Area   Area   Area   Area
Group                      N     1      2      3      4     Total
Home Economics             6    1.84   2.16   2.34   2.59   2.25
Saturday Art Age 12-14     6    1.42   1.42   3.25   2.94   2.25
Electrical Engineering     6    2.00   1.94   2.09   3.00   2.25
Saturday Art Age 15-17     6    1.75   1.67   1.75   3.84   2.25
Advertising Design         6     .92   1.84   2.00   4.34   2.25
Foundation Art             6    1.17   1.00   1.94   4.84   2.25
Total                     36    1.52   1.68   2.22   3.59   2.25
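
Tables 1 and 2 carry the same information in two forms. As a hedged illustration, the Python sketch below converts mean viewing times into per cent of total viewing time. The Foundation Art means are as read from Table 2, with decimal points restored where the scan dropped them. Because Table 1 lists a different N for this group, the percentages computed here need not match the printed ones exactly. The names are hypothetical.

```python
# Mean time in seconds per area, Foundation Art group, as read from Table 2.
# Assumption: decimal points restored where the scan dropped them.
mean_seconds = {"Area 1": 1.17, "Area 2": 1.00, "Area 3": 1.94, "Area 4": 4.84}

total = sum(mean_seconds.values())   # close to the nine-second exposure

# Per cent of total viewing time spent in each area.
percent = {area: 100 * secs / total for area, secs in mean_seconds.items()}

for area, pct in percent.items():
    print(f"{area}: {pct:.0f} per cent")
```

The same conversion applies to any row of Table 2, and to Table 5 with the eight-second exposure used for Chart B.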




Table 3

Analysis of Variance of Mean Time Spent in Viewing Each of Four Areas of Chart A by Members of
Six Student Groups (Based on Data in Table 2)

                                              Sum of    Mean
Source                                 df     Squares   Square    F

Advertising Design       Total         23     213
   Between areas                        3     161.5     53.83
   Among individuals                   20      51.5     [?]       [?]
Foundation Art           Total         23     242
   Between areas                        3     223.7     74.5
   Among individuals                   20       8.3     [?]       177.5
Art Age 15-17            Total         23     152
   Between areas                        3      80.3     26.8
   Among individuals                   20      71.7      3.6        7.5
Art Age 12-14            Total         23     202
   Between areas                        3      68       22.7
   Among individuals                   20     134        6.7      [?]
Electrical Engineering   Total         23      56
   Between areas                        3      18.3      6.1
   Among individuals                   20     [?]        1.9      [?]
Home Economics           Total         23      19
   Between areas                        3       1.8       .6
   Among individuals                   20      17.2       .9      [?]
Area I                   Total         35      67
   Between groups                       5      21.1      4.2
   Among individuals                   30      45.9     [?]         2.8
Area II                  Total         35      58.3
   Between groups                       5      17.8      3.6
   Among individuals                   30      40.5      1.4        2.6
Area III                 Total         35     124.9
   Between groups                       5      35.9      7.2
   Among individuals                   30      89        3        [?]
Area IV                  Total         35     241.6
   Between groups                       5      83       16.6
   Among individuals                   30     158.7      5.29     [?]
Total                                 143     884
   Between groups                      23     [?]       23.9
   Within groups                      120     [?]        2.8
   Areas                                3     [?]       [?]
   Student groups                       5     [?]       [?]
   Interaction                         15     [?]       [?]

Areas F equals 11.2**
Groups F equals 0.0

the ratio of 23.9:2.8. These data, which indicate that a
substantial proportion of variability is due to
differences between the groups, are significant at the
.001 level. A variance ratio of 22.1:3.7 for Table 6 is
equally significant and supports the hypothesis that the
various groups are (with exceptions noted by a more
detailed analysis of variance and F test found in Tables
3 and 6) representative of different populations.
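
Each variance ratio quoted above is a between-groups mean square divided by a within-groups mean square, and each mean square is a sum of squares divided by its degrees of freedom. The Python sketch below retraces this for Chart B. It is illustrative only: the sums of squares and degrees of freedom are read from Table 6, the within-groups sum of squares is assumed to be the total (746) minus the between-groups value (331.5), and the helper name is hypothetical.

```python
def mean_square(sum_of_squares: float, df: int) -> float:
    """Mean square = sum of squares / degrees of freedom."""
    return sum_of_squares / df

# Chart B, as read from Table 6 (within SS assumed to be 746 - 331.5).
ms_between_b = mean_square(331.5, 15)
ms_within_b = mean_square(746 - 331.5, 112)
f_chart_b = ms_between_b / ms_within_b

# Chart A: only the two mean squares are quoted in the text (23.9 : 2.8).
f_chart_a = 23.9 / 2.8

print(f"Chart B: {ms_between_b:.1f} : {ms_within_b:.1f}, F = {f_chart_b:.2f}")
print(f"Chart A: F = {f_chart_a:.2f}")
```

The sketch reproduces the quoted mean squares of 22.1 and 3.7 for Chart B. Judging significance would still require an F table or a statistics library, which is left out here.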

In viewing Chart A (Table 1) first year art students
spent the greatest per cent of time (fifty per cent of
total time) in viewing area IV. Advertising design
students spent 46 per cent of their time in that area,
and were followed in order by the 15-17 age group, the
12-14 age group, engineering students and home economics
students. It will be noted that the young art groups (age
12-14 and 15-17) both devote more time to the complex
areas than do the older third year college engineers and
home economics majors. It is indicated in this table that
progressively more time is spent in viewing the complex
areas by the art student and by the non-artist, but that
the rates of progression are substantially greater for
the art student. The same general pattern holds for Table
4 (per cent of time spent in viewing Chart B) except that
here it is indicated that the engineer spends a somewhat
greater proportion of time than the first year art
student in viewing the more complex area III. The
relatively simple area I demands less attention by all
except the secondary art student group. In this instance
age (contrary to [illegible] findings) rather than
artistic sophistication might be the determinant.

Tables 2 and 5 give mean time in seconds spent by each
group in viewing Charts A and B and give the same
information (in different form) as Tables 1 and 4. Tables
2 and 5 provide a basis for a determination of
significance of differences and for analysis of variance.

Tables 7 and 8 indicate that initial eye fixations are
primarily established in the upper areas (areas I and
II), and are primarily directed to the right (observer's
right). This is contrary to the findings of Brandt, and
contrary to the generally accepted belief that initial
eye fixations, determined by reading

Table 4

Per Cent of Time Spent in Viewing Each of Four Areas
of Design Chart B by Members of the Following Student
Groups: (1) Third Year Advertising Design;
(2) Third Year Electrical Engineering; (3) Foundation
(First Year) Art Students; and (4) Students Age 15-17
Taking Saturday Art Classes

                              Per Cent of Time in Each Area
                                Area   Area   Area   Area
Group                      N     1      2      3      4
Advertising Design         8     11     43     16     30
Electrical Engineering     9     14     35     19    [?]
Foundation Art            14     14     32    [?]    [?]
Saturday Art Age 15-17    11     23    29.5    18    29.5


Table 5

Mean Time (in Seconds) Spent in Viewing Each of Four
Areas of Design Chart B by Members of the Following
Student Groups: (1) Third Year Advertising Design;
(2) Third Year Electrical Engineering; (3) Foundation
(First Year) Art Students; and (4) Students Age 15-17
Taking Saturday Art Classes

                                Mean Time in Each Area
                                Area   Area   Area   Area
Group                      N     1      2      3      4     Total
Advertising Design         8     .87   3.5    1.35   2.38   2.0
Electrical Engineering     9    1.19   2.88   1.38   2.06   2.0
Foundation Art            14    1.87   2.25   [?]    [?]    [?]
Saturday Art Age 15-17    11    1.06   2.94   1.3    2.88   2.0
Total                     42    1.25   2.80   1.28   2.58   2.0


habits, are most frequently directed to the 
observer's left. It will be noted that the non- 
art groups and the younger art groups tend to 
make more initial fixations to the left (area I). 
Two explanations are suggested: (1) The indi- 
vidual who is less sensitive visually will be 
dominated by reading habits while the visually 
sophisticated will be more attentive to visual 
forms and less dominated by reading habits or, 
(2) The artistic individual (or the visually 
sensitive individual) will be less likely to have 
developed strongly established reading habits 
and is therefore less dominated by habits which 
stem from reading. The latter view is in conformity
with other data which suggests that a
slight negative correlation exists between per- 
formance in art, as measured by grades in art 
courses, and verbal ability, as measured by 
the L score of the American Council on Edu- 
cation Psychological Examination (7). 

Tables 9 and 10 demonstrate that the more 
artistically sophisticated groups tend to make 
fewer total fixations and tend to make a pro- 
portionately greater per cent of fixations in 
the more complex areas. The pattern of re- 
sponses of the 12-14 age group indicates fewer 
fixations than is found for the 15-17 group, 

Tables 3 and 6 give analysis of variance of 
the mean time spent in viewing each area by 
each group. Table 3 is based on means pro- 
vided in Table 1 derived from responses to 




Table 6

Analysis of Variance of Mean Time Spent in Viewing Each of Four Areas of Design Chart B by
Members of Four Student Groups (Data from Table 2)

                                              Sum of    Mean
Source of Variation                    df     Squares   Squares   F

Advertising Design       Total         31     256
   Between areas                        3     135       45
   Among individuals                   28     121        4.32     10.4
Electrical Engineering   Total         31     188
   Between areas                        3      68.2     22.7
   Among individuals                   28     119.8      4.28      5.3
Foundation Art           Total         31     238
   Between areas                        3     105.2     35.1
   Among individuals                   28     132.8      4.74      7.4
Art Age 15-17            Total         31      64
   Between areas                        3      23        7.7
   Among individuals                   28      41        1.46      5.3
Area I                   Total         31      56
   Between student groups               3      18.3      6.08
   Among individuals                   28      37.6      1.3       4.7
Area II                  Total         31      52.1
   Between student groups               3       1.3       .43
   Among individuals                   28      50.8      1.8      [?]
Area III                 Total         31     185.5
   Between student groups               3      25.1      8.4
   Among individuals                   28     160.4      5.7       1.5
Area IV                  Total         31     170.2
   Between student groups               3       4.3      1.4
   Among individuals                   28     165.9      5.9       .24
Total                                 127     746
   Between groups                      15     331.5     22.1
   Within groups                      112     414.5      3.7
   Areas                                3     246.4     82.4
   Student occupations                  3       0        0
   Interaction                          9      85        9.45

Areas F equals 8.7*
Groups F equals 0.
Table 7. Per Cent of Initial Fixations Made in Four Areas of Design Chart B by Members of Four Student Groups

Table 8. Per Cent of Initial Fixations Made in Four Areas of Design Chart A by Members of Six Student Groups


N 1 a 3 

N I 2 La 

Advertising Design 8 125 125 62 " MEUS 
á T 1239 625° 125 Advertising pea, so 00 bg 15 
Spovidetion At = 4& T3 T ses Ti eai ae 3 E so 700 437 
mede pene 9 111 090 556 33.3 Saturday Ast 15 17 ti 363 0.0 a A 
aturday Art 15-17 1 364 00 454 18.1 Saturday Art 12-44 9 333 00 Rn o 
Total Electrical Ene; i 112 a 19 

al <_< 42 167 47 61.9 [167 Home Om aaa : Ee 0.0 333 




Design Complexity as a Determiner of Visual Attention 


Table 9

Per Cent of Eye Fixations Made in Each of Four Areas of Design Chart B by Members of Four Student Groups and Mean Number of Fixations for Each Group in Each Area

                                            Area
Group                               1       2       3       4      All Areas
Advertising Design       %          20.8    20.8    31.0    27.4
                         mean N     1.5     1.5     2.25    2.0    8.4
Electrical Engineering   %          18.7    27.5    27.5    26.3
                         mean N     1.67    2.44    2.44    2.33   8.9
Foundation Art           %          17.8    21.4    32.2    28.6
                         mean N     1.78    2.14    3.22    2.86   10.0
Saturday Art 15-17       %          23.0    21.2    30.1    25.7
                         mean N     2.36    2.18    3.00    2.68   10.2


Chart A. Table 6 is based on analysis of
variance of means in Table 3 and is derived
from responses to Chart B. In general there
is greater variation between area responses
than among individuals in the groups. The
[illegible] and engineering students tend to be
more homogeneous in their visual patterns (less
among individuals variance) and tend to view


Table 10

Per Cent of Eye Fixations Made in Each of Four Areas of Design Chart A by Members of Six Student Groups and Mean Number of Fixations for Each Group in Each Area

                                            Area
Group                               1       2       3       4      All Areas
Advertising Design       %          21.7    20.6    25.0    33.7
                         mean N     2.2     2.1     2.1     2.6    10.3
Foundation Art           %          19.6    21.3    26.4    32.4
                         mean N     2.3     2.3     3.1     3.8    11.7
Electrical Engineering   %          27.6    20.0    22.8    29.6
                         mean N     3.0     2.3     2.7     3.7    11.6
Saturday Art 15-17       %          23.8    22.4    25.2    28.6
                         mean N     3.2     3.0     3.4     3.8    13.4
Saturday Art 12-14       %          22.9    20.2    26.6    30.3
                         mean N     2.8     2.4     3.2     3.7    12
Home Economics           %          22.0    26.8    22.0    29.2
                         mean N     3.0     3.0     3.7     4.0    13.7




the charts in a more homogeneous manner (less 
between areas variance) than do art students. 
Foundation art students are homogeneous as a 
group, but exhibit great variability in their 
response to areas. Only in the instance of the 
12-14 age group do we find that variability due
to differences among individuals within the 
group is greater than the variability due to 
differences in area complexity. 

As to the variability which is brought about 
by differences between areas it may be noted 
that area IV (Table 3) draws the most variable 
response, both between the groups and among 
individuals within the groups, while the less 
complex areas I and III draw a much more 
uniform response. Variability due to differ- 
ences among individuals is less than the vari- 
ability due to differences between groups, ex- 
cept in the instance of area IV, wherein the 
among individuals variance is greater than the 
between groups variance.

In the assignment of total variance for all 
groups, between groups variance is substanti- 
ally larger than within groups variance. The 
greater part of total variance must be assigned 
to differences between areas. 
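
The F values in Table 6 follow from the tabled sums of squares and degrees of freedom. The short sketch below recomputes two of them; the function names are illustrative only and are not taken from the original analysis, and it is assumed that each F is the between-areas mean square divided by the among-individuals mean square.

```python
def mean_square(sum_of_squares, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


def f_ratio(ss_between, df_between, ss_within, df_within):
    """F = between-areas mean square / among-individuals mean square."""
    return mean_square(ss_between, df_between) / mean_square(ss_within, df_within)


# Advertising Design rows of Table 6: between areas (SS 135, df 3),
# among individuals (SS 121, df 28).
f_advertising = f_ratio(135, 3, 121, 28)

# Electrical Engineering rows: between areas (SS 68.2, df 3),
# among individuals (SS 119.8, df 28).
f_engineering = f_ratio(68.2, 3, 119.8, 28)

print(round(f_advertising, 1), round(f_engineering, 1))
```

Rounded to one decimal place, the two ratios agree with the tabled values of 10.4 and 5.3.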


Summary 


The present experiment attempted to arrive 
at some preliminary conclusions regarding the 
influence of design complexity in determining 
visual attraction or attention. It is indicated 
that art students tend to devote a greater pro- 
portion of their observation time to the more 
complex areas than do non-artists. Variance 
due to complexity of design is significant for 
the more sophisticated art groups, as are the 
differences in mean time spent in viewing the 
designs. However, significant differences are 
not found for the less sophisticated groups,
indicating that sensitivity to design complexity
is a developmental process which increases
with age and with level of artistic
sophistication. The general pattern of data
supports the hypotheses that: (1) differences in
visual sensitivity or visual awareness and at- 
tention do exist between artists and non-artists 
and between age groups; and (2) art groups 
are more sensitive to or pay more attention to 
more complex design units than do non-artists 




when the factors of color and objectivity have 
been eliminated from the designs. 


Received November 4, 1949. 


References 


1. Birren, F. The story of color. Westport, Conn.: Crimson Press, 1941.

2. Brandt, H. F. The psychology of seeing. New York: Philosophical Library, 1945.

3. Buswell, G. T. How people look at pictures. Chicago: Univ. of Chicago Press, 1935.

4. Ensor, J. Selections from his writings. In Goldwater, R., and Treves, M. (Ed.), Artists on art. New York: Pantheon, 1947.

5. Fry, R. Vision and design. London: Wm. Clower and Sons, 1924.

6. Graves, M. The art of color and design. New York: McGraw-Hill, 1941.

7. Picasso, P. An interview. In Goldwater, R., and Treves, M. (Ed.), Artists on art. New York: Pantheon, 1947.

8. Woods, W. A. Environmental influences in the development of persistence in space-form manipulation and their relation to artistic potential. Dissertation in process and yet unpublished, Columbia Univ.

9. Zadkin, O. The poetic climate of art. In Goldwater, R., and Treves, M., Artists on art. New York: Pantheon, 1947.


Verbal and Pictorial Questionnaires in Market Research * 


Joseph Weitz 
Carnegie Institute of Technology


The purpose of this study was to compare the results obtained from two different types of questionnaires commonly used in market research. The two questionnaires used in this study were a verbal and a pictorial questionnaire.

Frequently in a single survey a questionnaire will be used which contains both verbal questions and a choice of pictorial items. The data obtained from these two types of questions are often treated similarly. It is hypothesized in the present study that different results may be obtained using these two techniques. If this is so, it does not seem warranted to treat the results in the same manner nor to evaluate equally the data obtained from these two sources.


Procedure 


The product to be studied in this survey is the cooking range. This was chosen since it is generally present in some form in every home and hence people to be sampled would be familiar with it. Two questionnaires were compared, one verbal and one pictorial, concerning the design of the cooking range. The questions on the verbal questionnaire were as follows:

(1) Do you prefer the table top (low oven) or the high oven?
(2) Do you prefer a window in your oven?
(3) Do you prefer a high or low location for your broiler?
(4) Do you prefer burner controls on the back vertical panel or the front panel?
(5) Which of the following burner arrangements do you prefer?
    (a) Two burners on each side with work space in the center
    (b) Four burners on one side with work space on the other side
    (c) Four burners staggered across the entire top
    (d) Four burners across the back with work space across the front
    (e) Two burners on each side with a built-in griddle in the center
(6) Which of the following do you prefer?
    (a) Oven in the center with storage space on both sides
    (b) Oven on the right with storage space on the left
    (c) Oven on the left with storage space on the right
    (d) Double oven
    (e) A high oven with storage space below
(7) If you had your choice, what color would you choose for your stove?
(8) Do you prefer to have toe space at the base of your stove?
(9) Do you prefer a hinged door or a drawer type storage area?

* The author wishes to express his appreciation to Mr. David Ellies who did all of the interviewing and assisted throughout the entire project.


The visual questionnaire was composed of a 
series of sketches involving the same discrim- 
inations which were asked for in the verbal 
questionnaire. For example, in question three 
— do you prefer the high or low location for 
your broiler?—in the pictorial questionnaire 
two drawings were made, one with a high 
broiler and one with a low, both stoves having 
the same basic design (see Figure 1). The 
person being interviewed was asked which of 
these stoves she would prefer. The same was 
done for all of the other questions. 

For both groups other information was ob- 
tained, such as educational background, num- 
ber of years the individuals had used the par- 
ticular stove they now have, and what type of 
stove they were using at present. In this 
paper only the data pertinent to the original 
hypothesis will be presented. 

The survey was conducted in the city of 
Pittsburgh. The sample consisted of 200 
adult females. This total sample was divided 
into two groups of 100 each; one group received
the verbal and one the pictorial questionnaire.
In each sample of 100, 10% were
from the A or highest socio-economic group; 
30% from the B socio-economic group; 40% 
from the C socio-economic group and 20% 
from the D socio-economic group. In this way 


Fig. 1. Sketches used for question 3 in the
Pictorial Questionnaire.


the two groups were matched on the basis of
socio-economic background.¹

One interviewer was used for all 200 cases.
This individual was an experienced inter- 
viewer. It was thought necessary to use only 
one interviewer so as to reduce that variable 
to a minimum. In both the verbal and pic- 
torial questionnaires the interviewer returned 
to those addresses where no one was home at 
the first call. Further, in administering both 
the verbal and pictorial questionnaire the in- 
ternal relationships of the sample were main- 
tained during the study. That is, all of the A 


group were not interviewed before the B, but,
for example, 1 A was interviewed, then 3 B's,
then 4 C's, then 2 D's, etc., so that no one group
was completed before starting the next.

¹ Manual for research associates and interviewers. New York: Market Research Division, The Psychological Corporation. Part 2, pp. 5-6.
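
The rotation described above can be sketched as follows. This is only an illustration of the scheme, assuming that the block of 1 A, 3 B's, 4 C's and 2 D's was repeated without change until the quotas of 10, 30, 40 and 20 were filled; the names `QUOTAS`, `BLOCK` and `interview_order` are hypothetical and do not come from the study.

```python
from collections import Counter

# Quota per sample of 100, as stated in the text.
QUOTAS = {"A": 10, "B": 30, "C": 40, "D": 20}

# One rotation block: 1 A, then 3 B's, then 4 C's, then 2 D's.
BLOCK = ["A"] + ["B"] * 3 + ["C"] * 4 + ["D"] * 2


def interview_order(quotas, block):
    """Repeat the block until every socio-economic quota is filled."""
    per_block = Counter(block)
    if any(quotas[g] % per_block[g] for g in quotas):
        raise ValueError("block proportions do not match the quotas")
    repeats = {quotas[g] // per_block[g] for g in quotas}
    if len(repeats) != 1:
        raise ValueError("block proportions do not match the quotas")
    return block * repeats.pop()


order = interview_order(QUOTAS, BLOCK)
print(len(order), Counter(order))
```

Under this assumption ten blocks complete each sample of 100, and at no point is one socio-economic group finished before the others have been started.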


Results 


The results of the study are shown in Table 1. It can be seen that in all cases, with the exception of question two, there was a significant difference at least at the 1% level as determined by the Chi-Square test.

On several items there was a complete reversal of the preferred response from the verbal to the pictorial questionnaire. In question seven, concerning color preference, the colors other than white were grouped in one category and white in the other category, giving a two by two table rather than a five by two table which would have been the case had all the colors been used (black, green, cream, blue and white).

Previously it was stated that even though no attempt was made to equate the two groups on the basis of educational background, they turned out to be quite similar. This can be seen from Figure 2. Since there would be fewer than five cases in some cells if Chi-Square were computed separately for each socio-economic group, all groups were combined to give the total educational level of each sample. From this combination of data the test of Chi-Square results in a value not significant at the 30% level. This then is an added check on the homogeneity of the two samples and would lend weight to the assumption that any differences which exist are due primarily to the type of questionnaire used. It is of interest to note the change in educational background from the A group to the D group.

It can be seen from Table 1 that the results obtained from the pictorial and the verbal questionnaire cannot be considered as homogeneous. Since the sampling technique was identical for both samples and since there was evidence of homogeneity of the educational background of the two groups, it would seem evident that the differences observed are due to the differences in the questionnaire technique. If this is so, the original hypothesis is substantiated and one must differentially evaluate results obtained in these two ways.



Table 1

Preferences of Interviewees in Each Sample

                                                Number Preferring
                                                                    Chi-Square
Question                  Choice                Ver.     Pic.       Value        P
1. High or low oven       High                  23       41         8.24         .010
                          Low                   77       59
2. Window in oven         Yes                   82       77         1.64         .300 to .500
                          No                    18       23
3. Broiler location       High                  41       60         7.22         .010
                          Low                   59       40
4. Burner controls        Front                 9        27         10.98        .001
                          Back                  91       73
5. Burner arrangement     2 each side           19       36         15.06        .010
                          4 one side            57       30
                          4 across top          7        12
                          4 across back         11       11
                          2 each side griddle   6        11
6. Oven arrangement       Center                14       18         25.04        .001
                          Right                 61       30
                          Left                  6        5
                          Double                5        5
                          High                  14       42
7. Color                  White                 88       55         27.13        .001
                          Not white             12       45
8. Toe space              Yes                   94       65         25.77        .001
                          No                    6        35
9. Storage                Drawer                31       67         26.32        .001
                          Hinge                 69       33
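
As a check on how the Chi-Square values in Table 1 arise, the following sketch computes the ordinary Pearson statistic for a contingency table. It is offered as an illustration rather than as the author's own procedure: the function name `chi_square` is hypothetical, and no continuity correction is assumed.

```python
def chi_square(table):
    """Pearson chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of choice and questionnaire type.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Question 4 (burner controls): rows are Front, Back;
# columns are verbal, pictorial.
burner_controls = [[9, 27], [91, 73]]
print(round(chi_square(burner_controls), 2))  # 10.98
```

For question 4 the result agrees with the tabled value of 10.98. With one degree of freedom the 1% critical value is 6.635, so this difference is significant at the level stated in the text.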


Fig. 2. Educational background of each socio-economic group in each sample.



The present study throws no light on the 
important problem of which of these two
methods is more valid. That is, if a survey is
to predict consumer behavior it should obviously
be important to know which of these
techniques, pictorial or verbal questionnaire, 
comes closer to actual buying behavior. Fur- 
ther research must be done in order to compare 
the usefulness of these two techniques with re- 
spect to their predictive value for consumer 
behavior. 

It should be pointed out that it is possible 
that had other pictures been used there might 
have been different results; therefore the pic- 
torial representation itself might be studied to 
determine the amount of variability obtained 


with various forms of visual presentation of 
questions. 




Summary 


Two different questionnaires were adminis- 
tered to two samples of 100 each. These 
samples were matched for socio-economic 
status and were homogeneous with respect to
educational background. One group received
a pictorial questionnaire, one group received
a verbal questionnaire. Statistically significant
differences were obtained between the two
techniques. It is concluded from this study
that one cannot use these two questionnaire
techniques interchangeably and that the data
obtained from these two methods should not
be equally evaluated.


Received November 7, 1949. 


An Exploratory Study of Linear Interpolation 


Harry Kreiger Miller, Jr. 


Lehigh University


The psychological problem of estimating the 
position of a point relative to two markers be- 
tween which it falls is of vital importance in 
Studies of dial and scale reading. Until re- 
cently there has been comparatively little sci- 
entific research published on this phase of inter- 
polation, Work which has been carried out 
has indicated that there are three explanations 
for the errors which occur in such interpola- 
tions: (a) individual differences; (b) size of the 
interpolated interval and of the markers; and
(c) biases.

Recognition that individual differences play 
an important part in this problem is indicated 
in a study on dial readings by Kappauf, Smith,
and Bray (7) when they came to the conclusion
that “subject differences and subject inter-
actions appear to demand an analysis of the
data subject by subject rather than analysis
of group averages.” In another study (9) on
interpolation between circular scale markers,
Leyzorek found that “individuals differ significantly
in their abilities to perform this kind of
visual interpolation.”

That accuracy of interpolation varies according to the size of the interval has been observed in most of the studies on this subject, including the two just mentioned (7) and (9). Grether and Williams (5) concentrated on this factor in their experiment on accuracy of dial reading, and in another study (8) on dial graduation Kappauf and Smith found that the frequency of dial reading errors was a function of the size of the scale unit. The size of the markers also has some effect, as mentioned by Bäckström (1) and some of the aforementioned.

Biases, or errors not randomly distributed, are the third category of causes of errors. Chapanis (3) found that whether the interpolated point was underestimated or overestimated ... Bartlett, Reed and Duvoison (2) and Leyzorek (9) found some evidence of the same type of bias. Reed and Bartlett (10) and Harriman and Bartlett (6) studied these biases more specifically in experiments on concentric rings and positions along a short line, respectively. Bäckström (1) in studies on a limited number of subjects found that biases were of greater influence than random errors as a cause of discrepancies.


Purpose 


The purpose of this research was to system- 
atically study the accuracy of visual interpola- 
tion with five different interval sizes, each 
having interpolated positions 1 through 9, 
using twenty-one subjects of differing age and 
occupation. 


Procedure 


Figure 1 shows a pattern of the 2 mm. size. 
Six different patterns of problems were drawn 
up originally using an interval of 10 mm. Each 
pattern consisted of 54 problems, of which there 
were six each of the nine different positions 
(1, 2, 3, 4, 5, 6, 7, 8, 9) randomly arranged. 

Direct prints were made of the original six 
patterns thus giving the 10 mm. interval. 
Photographic reductions were used for the 
5 mm., 3 mm., 2 mm., and 1 mm. interval sizes. 
These were then placed on ordinary photo- 
graphic mounting board. 


Fig. 1. Sample pattern of 2 mm. size.



Slotted shields were used to permit viewing 
one horizontal row of problems at a time. 
Mimeograph answer sheets were provided. 

Each subject was given one size of pattern 
at a time with the corresponding shield and 
answer sheets. He was asked to estimate in 
tenths the position of the inner line in relation 
to the two outside markers and to enter his 
judgment on the appropriate place on the 
answer sheet. He was told that the inner line 
always fell on an exact tenth (i.e., 1, 2, 3, 4, 5, 
6, 7, 8, or 9). He was permitted to work at 
his own speed but was asked to be as accurate 

as possible with each interpolation. Use of the 
shield was optional, so that it was possible for 
him to compare problems against each other. 
Judgments were made at normal reading dis- 
tance and no attempt was made to control 
lighting. No measurement of the subject’s 
visual acuity was made.
When the subject finished one set he was 
given the opportunity of doing another size 
immediately or of resting his eyes before at- 
tempting the next group. The sets were not 
given in any systematic order, but rather the 




subject was given his choice as to what order 
he desired to work on them. However, each 
individual completed all the problems of one 
size before being given the next set.

There were 324 judgments (6 patterns of 54
problems each) for each interval size, giving
a grand total of 1,620 estimations by each
subject.
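
The counts above can be verified with a few lines of arithmetic. The sketch also converts the extreme error totals given in the Results (3 and 538) into percentages of all judgments; the variable and function names are illustrative assumptions only.

```python
PATTERNS = 6                      # patterns per interval size
PROBLEMS_PER_PATTERN = 54         # six each of the nine positions
INTERVAL_SIZES_MM = [1, 2, 3, 5, 10]

per_size = PATTERNS * PROBLEMS_PER_PATTERN
per_subject = per_size * len(INTERVAL_SIZES_MM)


def error_rate_percent(errors, judgments=per_subject):
    """Errors expressed as a percentage of all judgments by one subject."""
    return 100 * errors / judgments


print(per_size, per_subject)
print(round(error_rate_percent(3), 2), round(error_rate_percent(538), 1))
```

On these figures the most accurate subject erred on about 0.19 per cent of the problems and the least accurate on about 33.2 per cent.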


Results 


A tabulation of the total number of errors 
according to the size of the interval is made 
in Table 1. (Occasional mistakes due to 
reversal in recording—i.e., placing a 1 on the 
answer sheet for a 9, or a 2 for an 8—were not 
counted as errors of interpolation.)

The figures indicate that individual differ-
ences are of primary importance, since the
totals extend from 3 errors for the most ac-
curate subject to 538 errors for the least
accurate.

Neither occupation nor sex is indicated as a
major controlling factor. Of the four subjects
making the fewest errors, two are males and
two are females; one is a graduate engineering

Table 1 


Number of Errors According to Size of Interval 


Size of Interval 


Subject Occupation Age Sex 1mm. 2mm. 3mm. 5mm. 10mm. 3 
RH Grad. Eng. Stud. 1 : 
FH Housewife 5 E ù i i ^ 0 i 
GD Grad. Educ. Stud. 21 f 3 0 0 0 1 1 
LO Undergrad. Eng. Stud. 27 m 1 ») 3 1 à 
RD Undergrad. Eng. Stud. 25 m 3 1 : 0 " 
FB Housewife 23 f 7 1 : : 1 ^ 

"HK Grad. Psych. Stud. 26 m 10 0 i i 6 * 
RR IBM Operator 21 m 6 4 : : 0 A 
GB College Registrar 55 -m 22 5 ^ : 0 " 
FF Grad. Psych. Stud. 24 m 14 3 p : 7 T 
BG 7th Grade Stud. 12 m 35 “4 : : 4 p 
EH Undergrad. Psych. Stud. — 23 m 6 ya T s 36 Er 
ME Office Clerk 18 f 33 S si = 11 ir 
JC Housewife 23 f 38 1 a 4 16 us 
WM Undergrad. Bus. Stud. 25 m 42 s = “4 ps 
m Accounting Clerk 50 m 58 i : " i st 

Housewife 50 f ^ D 
E. B 1 X 3 à Z » @ 
pA ex a 12 m 116 43 36 3 37 pe 

WJ Sea ^ » E : € n E id s á 

relay 29 — 165 86 — 9 90 106 2 



Table 2

Number of Errors According to Interpolated Position 
Note: L = Lower than true position; H = Higher than true position. 
Interpolated Position 
1 2 3 4 5 6 7 8 9 Totals 

Subje — — Grand 
Subject L y L H LALA LH LH LH LH LH L E Total 
BG or 1 3 wo no a p iu oe 02 $0 dk SS 
EH 0 20 $5 12 6 1 OM Os 9 21 04 00 7 78 85 
ME 05 2 6 5§ 5 7 6 60 55 6 3 2 6 30 56 3 9% 
JU 0 4 $2 2 Fred "9 3 30 1 44 15 3 70 B 42 123 
Wu quum qoo quo ard TARS Bae tE au 408 09 40 
HM cmo je didt me A7 Sak HS 2 4 T 0 59 128 187 
Di d 2 3 d Us x de d dU y TH 44 FF $9 2 106 199 
ES  (Q 4 2 2 at 5 65 6 19 4 D 47 434 4 0 40 154 102 256 
BL gye Qo oz 36 1 2014 i Iw 945 4 0 112 152 264 
BH. om 3 qe oid £8 ee sd) Dae Tos d 84 272 356 
M 0 99 9 01 36 59 9 m SB s ó5 12 49 7 4] 5 0 101 437 538 


student, one is a housewife, one a graduate education student, and one a second year undergraduate engineering student. Of the four subjects making the most errors, three are females and one a male, all of different occupations. Although most of the subjects are in the age-range of 18 through 28, the spread in total number of errors of the individuals outside this group lends credence to the inference that age is not a primary factor in accuracy.

In comparing the number of errors according to the size of the interval it was again found to be a matter of individual differences. Although the highest number of errors were made on the 1 mm. size by eleven of the subjects, there were two who had the greatest number of mistakes on the 2 mm. size, three who had the greatest inaccuracy on the 3 mm. size, two who had the most difficulty on the 5 mm. size, and one who found the 10 mm. size most difficult for correct interpolation. In addition, one subject had an equal number of errors on the 2 mm. and 5 mm., and one found the 3 mm. and one other size equally difficult. These data apparently point up the fact that there is no one optimum size for all subjects.

When the data were tabulated to determine possible bias due to position, the first ten subjects in Table 1 showed so little error that there was no indication of bias. However, the last eleven subjects did show some trend and their results are included in Table 2. The smallest


number of errors is found on the 1 and 2, and 
the 8 and 9 positions, the extremes of the scale. 
The 5 position, the center of the scale, had com- 
paratively few errors. The maximum number 
of misjudgments were made on the 4 and 6 
positions. The 4’s tended to be read as 3’s, 
and the 6’s as 7’s, indicating a bias outward 
from the center. The 7’s tended to be read as 
8’s, and the 3’s as 2’s. 

Individual differences, however, are ap- 
parent. For example, subject BH overesti- 
mated the 3's, 4’s, and 5’s, and underestimated 
the 7’s. Subject WJ made more mistakes on 
the 1’s and 2’s than on any of the others, 
whereas most subjects were comparatively 
accurate on these positions. Subject JC was 
least accurate on the 5’s. Some subjects, for 
example EH, HM, BH, and WJ, made more 
overestimations, while others, as ME, JC, WM, 
and ES, had a greater number of under- 
estimations. 
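
The tabulation behind Table 2 can be sketched in code. The judgments below are invented for illustration and are not data from the study. The set `REVERSALS` is an assumption: the text names only a 1 recorded for a 9 and a 2 recorded for an 8 as recording reversals, and the opposite pairs are added here by symmetry.

```python
from collections import Counter

# (true position, recorded position) pairs treated as recording reversals
# and excluded from the error count. The text gives (9, 1) and (8, 2);
# the mirror pairs are an assumption.
REVERSALS = {(9, 1), (8, 2), (1, 9), (2, 8)}


def tally_errors(judgments):
    """Count errors per true position as lower (L) or higher (H).

    `judgments` is an iterable of (true_position, recorded_position)
    pairs, each position being a tenth from 1 through 9.
    """
    counts = Counter()
    for true, recorded in judgments:
        if recorded == true:
            continue  # correct interpolation
        if (true, recorded) in REVERSALS:
            continue  # recording reversal, not an interpolation error
        counts[(true, "L" if recorded < true else "H")] += 1
    return counts


# Hypothetical judgments, invented for illustration only.
sample = [(4, 3), (4, 3), (6, 7), (7, 8), (3, 2), (5, 5), (1, 9), (8, 2)]
counts = tally_errors(sample)
print(counts[(4, "L")], counts[(6, "H")], counts[(1, "H")])
```

In this invented sample the 4's are read low and the 6's high, which is the outward bias from the center described above, while the two reversals are left out of the count.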

Although the small sample of subjects does 
not permit statistical generalizations, the 
unusually high level of performance of some 
of the individuals is worthy of note. Four 
had less than 1/4 of 1% errors, and the majority
of subjects had less than 6% errors. No mis- 
judgments were made on the 1 mm. size by 
the two most accurate subjects; this means 
that they discriminated differences of one- 
tenth of a millimeter. 

Although some general trends are evident, 




the individual is paramount above the size of 
the interval. For most subjects it became in- 
creasingly difficult to interpolate as the interval 
sizes decreased. Nevertheless, in some cases 
there was no decrease in accuracy and a few 
subjects did better at the smaller sizes. 


Summary 


In this study twenty-one subjects of differing 
age and occupation visually interpolated five 
interval sizes (1 mm., 2 mm., 3 mm., 5 mm., 
and 10 mm.), each size having 324 problems. 
The problems were randomly arranged in six 
different patterns, each pattern containing an 
equal number of the interpolated positions 1 
through 9. The subjects were asked to be as 
accurate as possible and were given as much 
time as they needed to complete each interval 
size. 

Eight of the subjects made less than thirty 
errors on the 1,620 problems. Results indicated
that individual differences were of greater
influence than interval size or biases due to
position differences.


Received November 7, 1949, 


References 


1. Bäckström, H. Die Dezimalgleichung beim Ablesen von symmetrischen Skalen, ihre zeitliche Beständigkeit und ihre Abhängigkeit von der Ablesungsgeschwindigkeit, der Übung, u.a. Z. Instrumkde, 1940, 60, 231-237; 261-271.

2. Bartlett, N. R., Reed, J. D., and Duvoison, G. Estimations of distance on polar coordinate plots as a function of the scale used. Systems Research, The Johns Hopkins University. Report ...

3. Chapanis, A. Accuracy of interpolation between scale markers as a function of scale interval number. Amer. Psychologist, 1947, 2, 346.

4. Chapanis, A., Garner, W. R., and Morgan, C. T. Instrument dials and legibility, and visual displays. Ch. V and VI in Applied experimental psychology. New York: Wiley, 1949.

5. Grether, W. F. and Williams, A. C., Jr. Speed and accuracy of dial readings as a function of dial diameter and angular separation of scale divisions. Ch. 7 in Fitts, P. M. (Ed.), Psychological research on equipment design. Washington: Gov't Printing Office, 1947, pp. 101-109.

6. Harriman, M. W. and Bartlett, N. R. Estimation of the relative position of a mark along a short line. (In press; private communication.)

7. Kappauf, W. E., Jr., Smith, W. M., and Bray, C. W. A methodological study of dial reading. Department of Psychology, Princeton University. Report No. 3, August 1947.

8. Kappauf, W. E., Jr. and Smith, W. M. A preliminary experiment on the effect of dial graduation and dial size on the speed and accuracy of dial readings. Ann. N. Y. Acad. Sci., 1948 (in press).

9. Leyzorek, M. Accuracy of visual interpolation between circular scale markers as a function of the separation between markers. J. exp. Psychol., 1949, 39, 270-279.

10. Reed, J. D. and Bartlett, N. R. Unpublished data. Systems Research, The Johns Hopkins Univ. (Private communication.)



Book Reviews 


Libo, L. M. Attitude prediction in labor relations: A test of understanding. Studies in
Industrial Relations, No. 10, Division of In-
dustrial Relations, Graduate School of
Business, Stanford University, 1948. Pp.
15. $1.00.


The thesis of this monograph is that one first step to good industrial relations is the understanding on the part of labor and management of the points of view of the other party. Such understanding together with willingness to understand, ability to learn, and availability of the other party builds improved interpersonal and intergroup relationships. The extent of this understanding can be measured by comparing for each party its beliefs concerning the attitudes of the other party with the actual views of the other party. The monograph, then, essentially has two parts: the arguments concerning the major thesis and an illustrative attempt to measure the degree of understanding between labor and management.

It is apparent that this study holds that the key to improved labor-management relations is clarification. One might expect, therefore, that both concepts and terminology would be considered quite rigorously. While the author has striven mightily in this direction, in the reviewer's opinion he has not wholly succeeded in hitting his mark. Libo's concept of understanding appears to change from time to time. In some instances it seems to refer simply to the accuracy with which one party perceives the point of view of the other, while in others it seems to imply a [illegible] those points of view. Similarly in certain parts of the discussion understanding appears to be a necessary condition for improvement in labor-management relations and in other parts this position is denied. It is never quite clear whether understanding is concerned with knowledge of the attitudes, motivations, etc., of the other party or the prediction of the most likely behavior under given conditions.
As an illustration of the way in which Libo
believes understanding can be measured, he
administered some 24 opinion questions con-
cerned with labor-management issues to a

group of 44 labor leaders and to a group of 33 
industrial relations directors. The members of 
each group not only indicated their opinions 
but also those that they would expect of the 
other group. Through an analysis of the re- 
sponses one is able not only to observe similari- 
ties and differences between the opinions of the 
two groups but also to observe the accuracy 
with which each group can predict—i.e., per- 
ceives—the opinions of the other. 

On all but two of the 24 issues the two parties 
expressed opposite opinions. Management’s 
predictions of the opinions of labor were found 
to be considerably more accurate than labor’s 
predictions of the opinions of management. 
Generalizations of these findings are particu- 
larly hazardous in view of the fact that the two 
groups utilized do not have direct dealings with 
each other. The labor group consisted of rep- 
resentatives of the International Longshore- 
men’s and Warehousemen’s Union (CIO), and 
the management group were members of the 
California Personnel Management Association 
and each was from a different company. 

On the positive side it can be said that the 
entire monograph is most stimulating. It is 
fruitful of ideas and should provoke further 
interesting discussion and research. Those 
interested in problems connected with such 
studies undoubtedly can profit from Libo’s 
analysis of the problem of measuring under- 
standing. 

For those who may wish to refer to this 
monograph, it will be noted that the date is not 
cited. The reviewer has been informed that 
the date of publication is August 1948. 

Edwin E. Ghiselli 


University of California, 
Berkeley, California 
Van Der Lugt, M. J. A. V. D. L. Adult psy- 
chomotor test series for the measurement of 
manual ability. New York: New York Uni- 
versity, 1948 (mimeographed); V. D. L. 
Psychomotor test series for children for the 
measurement of manual ability. New York: 
New York University, 1948 (mimeographed). 
This series of ten relatively simple tests “for 
the measurement of manual ability” requires
only a small amount of test equipment and 
simple instructions for administration and 
scoring. The reliabilities for each test are 
said to be “over .90.” Five basic components 
of speed, pressure, accuracy, motor memory 
and coordination are each represented by two 
tests, as follows: 1. Speed-prehension; 2. Speed- 
asynkinesia; 3. Pressure-reproduction; 4. Pres- 
sure-control; 5. Accuracy-steadiness; 6. Ac- 
curacy-precision; 7. Motor memory-direction; 
8. Motor memory-spatial; 9. Coordination- 
static; and 10. Coordination-dynamic. 

Thus far it would seem that the battery 
fulfilled most of the requirements of useful 
measuring devices, since it samples most of the 
basic aspects (speed, etc.) of manual activities, 
is simple, objectively scored, and has a reason- 
able number of cases for its tentative norms. 

However the basic problem of psychomotor 
testing has nearly always been one of validity, 
or “what the tests purport to measure." 
Ideally a psychomotor test battery should 
measure representative samples of each com- 
ponent of motor action, e.g., speed, precision, 
etc., so that the subscores could be used to describe or predict an individual’s skill or aptitude for each component of any complex skill which he might be expected to develop.

Several coefficients of validity are cited, correlations with five point ratings by employers of skill on the job in small factories: r = .73 for women and .65 for men on the static coordination test, and .51 for women and .58 for men on the dynamic coordination test. Unfortunately the numbers of cases are described only as "small" and therefore cannot be evaluated as to their representativeness.

The only adequate tests of validity for these tests would be a series of correlations with a variety of practical manual performances, all on groups of persons large enough to permit known statistical significance and with criteria of known reliability. Fine (as contrasted with gross) psychomotor tests have long been known to be quite specific in nature or related only within narrow group factors of other tests, and the validity of fine motor tests for the prediction of manual skills has been demonstrated for only a few practical skills (air crew specialists and rifle marksmen).

In view of the above facts the reviewer may be pardoned for urging a very strong caution as to the appropriateness of naming the V. D. L. battery a psychomotor test series "for the measurement of manual ability." The reviewer’s own research strongly suggests that various components such as speed of motor skills are not primarily determined by any stable biological characteristic of the individual so much as they are determined by happening to hit upon qualitatively different work methods which are often quite specific to each skill studied (cf. Seashore, R. H. Theoretical and experimental analyses of fine motor skills. Amer. J. Psychol., 1940, 53).

It is to be hoped that Dr. Van Der Lugt will soon be able to furnish more definite evidence as to the validity or non-validity of the battery or its subtests for the prediction of manual ... and of the battery for the appraisal of educational, clinical, ... individual ...

Robert H. Seashore


Northwestern University
—— 


Tes 
Chap, ^ CO. 1950. Pp. 296. $380. 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor,
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota


Beginning experimental psychology. S. Howard Bartley.
New York: McGraw-Hill Book Co., Inc., 1950. Pp.
483. $4.00.

Teaching the child to read. Guy L. Bond and Eva Bond
Wagner. Revised edition. New York: The Macmillan
Co., 1950. Pp. 467. $3.75.

A history of experimental psychology. Second edition.
Edwin G. Boring. New York: Appleton-Century-Crofts,
Inc., 1950. Pp. 775. $6.00.

Developing men for controllership. T. F. Bradshaw.
Boston: Division of Research, Harvard Business
School, 1950. Pp. 232. $3.25.

Selected readings in social psychology. Steuart Henderson
Britt, Editor. New York: Rinehart and Co.,
Inc., 1950. Pp. 507. $2.00.

Recent experiments in psychology. Second edition.
Leland W. Crafts, Elsa Robinson, Theodore C.
Schneirla, and Ralph W. Gilbert. New York:
McGraw-Hill Book Co., Inc., 1950. Pp. 491. $3.50.

Rating employee and supervisory performance. M. Joseph
Dooher and Vivienne Marquis, Editors. New York:
American Management Association, 1950. Pp. 192.

Handbook of employee selection. Roy M. Dorcus and
Margaret H. Jones. New York: McGraw-Hill Book
Co., Inc., 1950. Pp. 349. $4.50.

S. Weir Mitchell, novelist and physician. Ernest Earnest.
Philadelphia: University of Pennsylvania Press,
1950. Pp. 278. $3.50.

Be your real self. David Harold Fink. New York:
Simon and Schuster, Inc., 1950. Pp. 307. $2.95.

You and your health. J. Roswell Gallagher. Chicago:
Science Research Associates, Inc., 1950. Pp. 48.

Studies in leadership: leadership and democratic action.
Alvin W. Gouldner, Editor. New York: Harper and
Brothers, 1950. Pp. 768. $5.00.

Theory of mental tests. Harold Gulliksen. New
York: John Wiley and Sons, Inc., 1950. Pp. 462.
$6.00.

General clinical counseling. Milton E. Hahn and Malcolm
S. MacLean. New York: McGraw-Hill Book
Co., Inc., 1950. Pp. 373. $3.50.

Counseling the handicapped in the rehabilitation process.
Kenneth W. Hamilton. New York: The Ronald
Press Co., 1950.

... A. Hamrin. ... Knight Publishing
Co., 1950. Pp. 224. $3.00.

How to use psychology for better advertising. Melvin S.
Hattwick. New York: Prentice-Hall, Inc., 1950.
$5.75.


Speech therapy for the physically handicapped, Sara 
Stinchfield Hawk. Stanford: Stanford University 
Press, 1950. $4.00. 

Testing results in social casework. J. McV. Hunt, 
Margaret Blenkner, and Leonard S. Kogan. New 
York: Family Service Association of America, 1950. 
Pp. 64. $2.00. 

Measuring results in social casework. J. McV. Hunt
and Leonard S. Kogan. New York: Family Service
Association of America, 1950. Pp. 79. $1.50.

Adolescent development. Elizabeth B. Hurlock. New
York: McGraw-Hill Book Co., Inc., 1950. Pp. 566.
$4.50.

The principles of psychology. William James. New
York: Dover Publications, Inc., 1950. Pp. 700.
$7.50.

Psychology in everyday living. Ralph L. Johns. New
York: Harper and Brothers, 1950. Pp. 564. $3.50.

A comparison of diagnostic and functional casework concepts.
Cora Kasius, Editor. New York: Family
Service Association of America, 1950. Pp. 169.
$2.00.

Tensions affecting international understanding. Otto
Klineberg. New York: Social Science Research Council,
1950. Pp. 227. Cloth, $2.25; Paper, $1.75.

The growth and development of executives. Myles L.
Mace. Boston: Division of Research, Harvard Business
School, 1950. Pp. 200. $3.25.

The meaning of anxiety. Rollo May. New York: The
Ronald Press Co., 1950. Pp. 376. $4.50.

Surveys, polls and samples. Mildred B. Parten. New
York: Harper and Brothers, 1950. Pp. 624. $5.00.

The criminality of women. Otto Pollak. Philadelphia:
University of Pennsylvania Press, 1950. Pp. 180.
$3.50.

The Porteus Maze Test and intelligence. Stanley D.
Porteus. Palo Alto: Pacific Books, 1950. Pp. 194.
$4.00.

Psychology, a biosocial study of behavior. E. Terry
Prothro and P. T. Teska. Boston: Ginn and Co.,
1950. Pp. 546. $3.75.

Psychological problems in mental deficiency. Seymour
B. Sarason. New York: Harper and Brothers, 1950.
Pp. 365. $5.00.

Religion and the cure of souls in Jung's psychology.
Hans Schaer. New York: Pantheon Books, Inc.,
1950. Pp. 221. $3.50.

Occupational information. Carroll L. Shartle. New
York: Prentice-Hall, Inc., 1950. Pp. 339. $3.50.

Analytic group psychotherapy. S. R. Slavson. New
York: Columbia University Press, 1950. Pp. 275.
$3.50.


The psychology of mental health. Louis P. Thorpe.
New York: The Ronald Press Co., 1950. Pp. 747.
$5.00.

Educational psychology. William Clark Trow. Boston:
Houghton Mifflin Co., 1950.

Management behavior and foreman attitude. David N.
Ulrich, Donald R. Booz, and Paul R. Lawrence.
Boston: Division of Research, Harvard Business
School, 1950. Pp. 56. $.75.
undis. O fi Ki 
e m m A New York: The 
y, . Pp.148. $3.00. 


philo- 


Journal of Applied Psychology 


Vol. 34, No. 6


DECEMBER, 1950 


The Concentric Organization Chart 


C. G. Browne 


Wayne University 


The use of organization charts to illustrate the individual and departmental relationships and the lines of authority and responsibility in any organized situation has long been a generally accepted procedure. Such charts perform the same functions with these non-quantified relationship variables as graphs do with quantified data. The organization chart also offers the same advantages of the graph, primarily the advantage of easier and quicker comprehension of the material under consideration.

It is the purpose of this article to present a type of organization chart diagramming, illustrated in Figure 2, which will be referred to as the concentric organization chart, and to indicate the advantages of this diagrammatic representation of relationships over the more commonly used or traditional organization chart illustrated in Figure 1. Both Figures 1 and 2 are based upon the organizational structure of a manufacturing company employing approximately 1,500 persons, all of the executives through the fourth echelon of the business being included in the charts illustrated.

It is proposed that, on the basis of the following considerations, the concentric organization chart is a more satisfactory representation of the material being presented than the traditional organization chart.

1. Provides a better representation of the dynamic and personal relationships as they exist in organizational structure. The concentric organization chart represents dynamic, flowing relationships which proceed and are directed toward the center or inwardly, while at the same time there is a flow which proceeds and is directed from the center or outwardly. Characteristically, the functioning of any group or organization centers “around” individuals, not “below”
them. This is clearly presented in the circular 
plan in Figure 2, culminating in this case with 
the President and General Manager as the 
focal point for the activities of the other execu- 
tives involved. Or it might be said that 
individuals in a group, whether executive or 
supervisory or not, are surrounded with con- 
tacts, influences, and relationships that stimu- 
late them from all directions. The schematic 
representation of these factors is better dia- 
grammed as a circular rather than a horizontal 
plane. These flowing, surrounding, focalizing 
relationships then take on the desirable aspects 
of a smoothly functioning organizational unit 
in that they represent schematically the co- 
operation, intercoordination, and mutual de- 
pendency which characterize successful group 
activities. 

2. Eliminates the “above and below" concept. 
The importance of the principles of general 
semantics is gaining recognition in psychology, 
particularly where it can be shown that human 
relationships and reactions are causally related 
to words as conceptual symbols. Many words, 
phrases, and concepts take on or are promoted 
for a certain emotional significance which in- 
fluences the individual. Most individuals 
readily recognize various levels of authority 
and responsibility, and will accept their per- 
sonal position in relation to these various levels. 
However, an emotional aspect attaches itself 
to this recognition when it becomes a reflection 
on the status of the individual in the concept 
of above or below, higher or lower, superior or 
inferior, up or down, top or bottom. The 
concentric organization chart eliminates this emotional concept, and diagrams organizational structure in such a way that each individual is a supporting part of the structure, the entire structure being dependent on all of these parts for its operation, since a break in any one of the concentric circles has an effect upon the entire organization and its functioning. Various echelons of authority are represented by the distance which any given circle is from the center or focal point.

Fig. 1. Traditional type organization chart. Note: A. Mgr. Shipping; B. Mgr. Production; C. Mgr. Quality Control; D. Plant Engineer; E. Industrial Engineer; F. Personnel Director; and G. Bicycle Tire Products.

Fig. 2. Concentric organization chart. Note: Circles indicate echelon level; black connecting lines indicate flow of authority.

3. Presents an organization without loose ends. 
An organization is a self-contained unit within 
the sphere of its own operations. However, 
the traditional organization chart does not 
represent this, unless some limits are superimposed on the chart itself. Instead, the ends or the right and left of each echelon level represent a drop-off or vacuum of the organization's activities which is not characteristic of the functioning of and the relationships existing within the organization. In effect, the concentric organization chart brings these loose ends together into circular form in a diagram which represents a self-contained unit functioning on a continuum within its own boundaries contained in the chart, all of which portray the actual functioning of the organization.

4. Eliminates the upside-down ... The ... of the relationships existing within the organization are not dependent upon the position of the diagram which represents them. An organization chart is designed to present certain facts about a given organization at a given time. These facts or relationships do not depend on viewing the organization from any particular angle or any specific position, which is also the case for the concentric organization chart. However, with the traditional organization chart, the relationships which it intends to portray can be viewed properly only when the chart is presented to the viewer in a certain position, that is, with the top at the top and the bottom at the bottom. Should this not be done, the organization would be upside down, with relationships existing inversely to the facts of the organizational structure.

5. Simplifies designing and understanding. After the facts of the organization have been determined, there is a difficult problem in the arrangement of these facts on the diagram.

This arrangement problem is particularly diffi- 
cult on a traditional organization chart 
primarily because of the physical limitations of 
this type of chart. It is inevitable that the 
number of individuals will increase as the 
organizational set-up proceeds further and 
further from the focal point, or echelon one. 
That is, there is only one president, there may 
be four or five vice-presidents and other officers, 
15 or 20 department heads, several hundred 
foremen, and thousands of non-supervisory 
personnel. However, on the traditional or- 
ganization chart, the amount of space available 
for say the fifth or sixth echelon with 100 or 
more individuals is exactly the same space 
which is available for echelon one with one 
individual, the available space being a function 
of the width of the chart. This leads to many 
attempts and deviations by planners of or- 
ganization charts to circumvent this problem 
in the best manner possible by changing the 
size of the box designating the individual 
(Figure 1, echelon 3), by raising or lowering the 
echelon level of some individuals in so far as 
the chart is concerned to an echelon where 
space is available (leading to confusion in 
following the relationships), by grouping indi- 
viduals, or by any number of other devices. 

The nature of the concentric organization 
chart automatically eliminates this primary 
problem of space in designing the chart. As 
illustrated in Figure 2, the necessity of in- 
cluding 11 executives on echelon three presents 
no problem since the third echelon circle is 
larger than the first and second echelon circles, 
where the number of executives is smaller.
The same principle would follow in any ex- 
pansion of the chart. This makes it possible 
to maintain a constant plan for showing the 
relationships between echelons and individuals, 
and for following these relationships through 
the system of connecting lines on the chart, 
which is not possible with the complicated 
net-work of lines and jumps to the left, right, 
up, or down which frequently create problems 
of interpretation and understanding with the 
traditional organization chart. 
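
The space argument above can be illustrated with a short Python sketch. It is not taken from the article: the function names, the unit ring spacing, and the echelon counts other than the 11 executives on echelon three are assumptions made for illustration, and every echelon is assumed to contain at least one individual. The sketch places echelon one at the center and each later echelon on a circle whose radius grows with the echelon number, so that the circumference available to an echelon grows as the chart expands.

```python
import math


def concentric_layout(echelon_counts, ring_spacing=1.0):
    """Return {echelon: [(x, y), ...]} with individuals spaced evenly on rings.

    echelon_counts[0] is the number of individuals on echelon one, and so on.
    Echelon k is drawn on a circle of radius (k - 1) * ring_spacing, so
    echelon one sits at the center (hypothetical drawing convention).
    """
    layout = {}
    for level, count in enumerate(echelon_counts, start=1):
        radius = (level - 1) * ring_spacing
        layout[level] = [
            (radius * math.cos(2 * math.pi * i / count),
             radius * math.sin(2 * math.pi * i / count))
            for i in range(count)
        ]
    return layout


def arc_per_individual(echelon_counts, ring_spacing=1.0):
    """Arc length of the ring available to each individual on each echelon."""
    arcs = {}
    for level, count in enumerate(echelon_counts, start=1):
        radius = (level - 1) * ring_spacing
        arcs[level] = 2 * math.pi * radius / count
    return arcs


# Only the 11 executives on echelon three come from the text;
# the other counts are illustrative assumptions.
counts = [1, 4, 11, 20]
layout = concentric_layout(counts)
arcs = arc_per_individual(counts)
for level in layout:
    print(level, len(layout[level]), round(arcs[level], 3))
```

On a traditional chart every echelon shares the same page width, whereas here the ring for echelon three is twice as long as the ring for echelon two; whether that is enough room in a given case still depends on how fast the counts grow.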


Received January 16, 1950. 


Custom Made Systems of Job Evaluation 


J. Stanley Gray 


University of Georgia 


Jobs have been evaluated since the beginning 
of employment but only in the last 25 years 
have any attempts been made to do this 
systematically. The term “job evaluation”
now designates a systematic¹ attempt to weigh
separately the worth of various elements or 
factors which constitute a job. This evaluation 
may be in money values (factor comparison), 
or point values (point systems), or some form 
of over-all classification. It is (or should be) 
preceded by job analysis, which also has been 
systematized in recent years. In other words, 
the trend in job evaluation and its concomitant 
activities has been toward systematization, 
with the assumption that more system will 
somehow result in greater validity. 

This article presents one method of validating 
systems of job evaluation by elementary statistical procedure. Assuming that job evaluation is the determination of relative job values (i.e., the equitable distribution of a payroll, large or small) within an establishment, two other foundation theses are here presented. First, relative job values can be determined best by evaluating only those factors which are variables. If a factor exists in equal amount in all jobs being evaluated in an establishment, it is useless as a means of determining relative job values. This means that factors peculiar to an establishment must be used in evaluating jobs in that establishment. In other words custom made systems of job evaluation must be developed to fit each new job evaluation situation.

Second, the validity of any system of job evaluation (point or factor comparison) can be tested by comparing the percentage of total evaluation points and the percentage of wages carried by each of the key jobs.

On the basis of these assumptions, the writer has developed and validated custom made systems of job evaluation for establishments in both steel and textile industries and for the non-academic jobs in all 16 units of the University System in Georgia. In each application a job evaluation system was developed ... evaluated these properly. In ... situations, two lists of key jobs were used to double test the validity of the system.

¹ The word "scientific" is often erroneously used instead.

The procedure used may be described in seven steps:


1. All jobs to be evaluated are first carefully and thoroughly analyzed and described. Any schedule of job analysis is appropriate but the one developed by the War Manpower Commission and reported in the Training and Reference Manual for Job Analysis, Government Printing Office, 1944, is recommended. Too often job evaluators study only those factors included in their ready made system and ignore many other important aspects of a job. When a job is analyzed before the evaluation factors are known, the analysis is likely to be more thorough and more appropriate for other uses such as employment, training, etc.

2. The next step is a careful analysis of the job analyses to discover the important factors which are to become the basis of the evaluation. Some of the standard factors (education, working conditions, experience, hazards, etc.) are investigated to see if they differentiate jobs from each other. If they are the same from job to job they are not used as evaluation factors. The number of factors that are used depends on the number of differentiating characteristics. Frequency tables are made for each factor investigated.

3. The differentiating factors thus discovered are then assigned weights according to the estimated importance of each. The writer arbitrarily uses 100 as the


basic weight for all factors. The value of 
each factor is then the per cent of its importance 
in relation to other factors. This constitutes 
the value of the first degree. Higher degrees 
are arithmetically increased by multiplying 
this base value by the degree number. Thus 
factor A (perhaps working conditions) may 
carry a weight of 10 as a base (first degree) and 20, 30, 40, etc. for higher degrees. This means that if the first degrees of all factors total 100, the second degrees will total 200, the third 300, etc. Most point systems of job evaluation determine factor and degree weights in this way. Illustrative weights are given in Table 1.

Table 1

Factor and Degree Weights for a System of Job Evaluation

                                         Degrees
Factor                               I    II   III   IV    V
1. Education                        10    20    30   40   50
2. Experience                       20    40    60   80  100
3. Physical Effort                  10    20    30   40   50
4. Mental Effort                    10    20    30   40   50
5. Working Conditions               10    20    30   40   50
6. Hazards                          10    20    30   40   50
7. Responsibility for Safety        15    30    45   60   75
8. Tools, Materials, Equipment      15    30    45   60   75

A part of the weighting step in developing a custom made system of job evaluation, and yet a part that should come only after the weighting has been completed, is the development of a manual. Each factor must be carefully defined, as well as each degree under each factor. A job evaluation manual is intended to be used as a guide so that the rating of jobs will not be difficult. How much of a trait a job may have should be easily decided by referring to the manual. Job evaluation manuals should be written in the most simple yet the most exact language possible. They should be brief yet not ambiguous. The various key jobs can be added later as typical examples of the degree meaning of each factor.

4. The fourth step is to select 15 or 20 key or bench mark jobs, that is, jobs which already carry proper pay rates. These may be selected in various ways, but it is well to have an advisory committee composed of individuals who are intimately acquainted with the jobs to be evaluated.² Men who have worked in an establishment for years are better able to select key jobs than is a job evaluator who has only a short and incomplete knowledge of job duties and even less acquaintance with pay rates. These key jobs should be representative of the jobs to be evaluated. If some department has no job that is properly paid, in the opinion of the committee, a typical job can be selected and a fair wage assigned to it even though the actual wages are higher or lower. A key job with a hypothetical fair wage is just as good as one with an actual fair wage. With this system, jobs can be evaluated only as accurately as the rates on key jobs are accurate.

² Both management and labor should be represented on this committee. It can be used later to advise regarding the evaluation of more difficult and complex jobs.

5. Obviously the next step is to evaluate key 
jobs, using the table of factors and weights as 
developed in Step 3. When the total point 
value of each key job is determined, a table of 
evaluation points and wages should be constructed. See Table 2, columns 2 and 4.
Then the per cent of total evaluation points 
as well as the per cent of total wages should 
be calculated for each key job. See Table 2, 
columns 3 and 5. The difference between the 
percentage of total evaluation points and the 
percentage of total wages carried by a job 
indicates the accuracy of the job evaluation. 
See Table 2, column 6. The statistical signifi- 
cance of this percentage difference should then 
be calculated. See Table 2, column 7. If this 
difference is significant, it is evidence that the 
job is not accurately evaluated and the table 
of factor weights must be adjusted. Only 
when all key jobs are evaluated in such a way 
that no difference between percentage of total 
evaluation points and percentage of total wages 
is statistically significant can the evaluation 
system be called valid. 
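
The arithmetic of Steps 3 and 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's own computation. The base weights are the first-degree column of Table 1, with the base weight for Responsibility for Safety read as 15 so that the first degrees total 100. The five key jobs are those whose points and wages are legible in Table 2, so the percentages are recomputed on that subset and differ from the printed ones. The article does not give the significance formula, so `flag_jobs` and its `tolerance` are a hypothetical stand-in for the test of column 7.

```python
def degree_weights(base_weights, n_degrees=5):
    """Step 3: the weight of degree d is the base weight multiplied by d."""
    return {factor: [base * d for d in range(1, n_degrees + 1)]
            for factor, base in base_weights.items()}


def percent_shares(values):
    """Per cent of the total carried by each job."""
    total = sum(values.values())
    return {job: 100.0 * v / total for job, v in values.items()}


def share_differences(points, wages):
    """Step 5: per cent of total points minus per cent of total wages."""
    point_pct = percent_shares(points)
    wage_pct = percent_shares(wages)
    return {job: point_pct[job] - wage_pct[job] for job in points}


def flag_jobs(differences, tolerance):
    """Hypothetical stand-in for the significance test: list the jobs whose
    absolute percentage difference exceeds an assumed tolerance."""
    return [job for job, d in differences.items() if abs(d) > tolerance]


# First-degree weights from Table 1 (row 7 read as 15; an assumption).
BASE_WEIGHTS = {
    "Education": 10, "Experience": 20, "Physical Effort": 10,
    "Mental Effort": 10, "Working Conditions": 10, "Hazards": 10,
    "Responsibility for Safety": 15, "Tools, Materials, Equipment": 15,
}

# Subset of Table 2 key jobs with legible points and hourly wages.
POINTS = {"Die Sinker A": 348, "Crane Hitcher": 260, "Grinder (Cutters)": 241,
          "Inspector (Bench)": 229, "Straightener": 164}
WAGES = {"Die Sinker A": 1.95, "Crane Hitcher": 1.45, "Grinder (Cutters)": 1.30,
         "Inspector (Bench)": 1.25, "Straightener": 0.90}

weights = degree_weights(BASE_WEIGHTS)
diffs = share_differences(POINTS, WAGES)
for job, d in diffs.items():
    print(f"{job:20s} {d:+.2f}")
print("Jobs to re-examine:", flag_jobs(diffs, tolerance=1.0))
```

A job returned by `flag_jobs` would correspond to a key job whose points and wages disagree enough that the factor weights need adjustment; the choice of tolerance is the part the article leaves to the statistical test.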

6. It is well to double check a job evaluation system, especially if the point values of factors and degrees had to be juggled repeatedly to fit the key jobs. This necessitates a second list of key jobs chosen with the same care as the first. The jobs should be evaluated and a table constructed as described in Step 5. Again, if the differences between per cent of total points and per cent of total wages are great enough to be statistically significant, the evaluation system is not yet valid. Only when a job evaluation system justifies the wages of the key jobs can it be said to be valid enough to justify using it to evaluate other jobs which carry wage rates that may be too high, or too low, or just right.

Table 2

Key Jobs with Wage and Point Values

                            Evalua-  Per Cent          Per Cent
                            tion     of Total  Hourly  of Total  Differ-  S.D. of
Key Jobs                    Points   Points    Wage    Wage      ence     Difference
Die Sinker A                 348     10.3      $1.95   10.5      ...      ...
Hammersmith                  ...     ...       ...     ...       ...      ...
Millwright (Maintenance)     305      9.0      ...     ...       ...      ...
Machinist (Maintenance)      286      8.4      ...     ...       ...      ...
Shaper Operator              275      8.1      ...     ...       ...      ...
Crane Hitcher                260      7.7      1.45    ...       ...      ...
Grinder (Cutters)            241      7.1      1.30    7.0       ...      ...
Inspector (Bench)            229      6.8      1.25    ...       ...      ...
Scale Blower (Forge)         214      6.3      1.18    ...       ...      ...
Furnace Unloader             204      6.0      1.10    5.9       ...      ...
Roll Mill Helper             190      5.6      1.05    5.7       ...      ...
Tester (Brinell)             179      5.3      1.00    5.4       ...      ...
Drill Press Opr.             172      5.1      0.95    5.1       ...      ...
Straightener                 164      4.8      0.90    4.8       ...      ...
Total                       3388    100.0    $18.58  100.0


7. The final step is to evaluate other jobs in the plant for which the system has been designed.³ This means the preparation of an evaluation sheet for each job, a summary sheet showing the factor weights for all jobs, and a double entry graph showing point values and present wage rates of all jobs. These results are then presented to the evaluation committee for their review and recommendations. These final steps are no different from those of any other job evaluation system.

³ ... appropriate for evaluating clerical jobs, managerial jobs, or technical jobs. ... jobs are differentiated by ... be evaluated by the use of ...


Summary

Two innovations in job evaluation have been proposed in this paper. First, jobs are most effectively evaluated by factors that are differentiating characteristics. Ready made systems of job evaluation do not fit because their factors do not differentiate between jobs. Valid job evaluation systems must fit the jobs being evaluated. Second, systems of job evaluation can be validated by computing the statistical significance of differences between the evaluation of key jobs and the wage rates they carry. Key jobs are defined as those that already carry proper wage rates. When properly paid jobs are properly evaluated the evaluation system is then valid.


Received January 6, 1949. 


Rating Training and Experience 


Laverne K. Burke and Erwin K. Taylor 


Personnel Research Section, AGO* 


One of the most expensive, laborious and time-consuming chores in the examining process of any merit system is the evaluation of training and experience of applicants. It is also one of the least reliable and least objective aspects of the entire examining process.

As the importance of the position under consideration increases there is generally a concomitant increase in both the weight given to background factors and in the difficulty of evaluation. The task frequently requires the services of subject-matter experts beyond the staff of most examining agencies. Where the number of applicants is large, difficulty in securing an adequate number of qualified raters with time to devote to the task frequently delays completion of the register for a matter of months. Additional problems are introduced by the need to train raters to achieve some degree of uniformity in standards and to assure the applicants of bias-free evaluations.

A task of similar nature faced the Department of the Army when, after the war, a need was felt to re-evaluate Army officers. To accomplish the job the Officers Efficiency Evaluation Board was appointed to place all officers in a relative order of merit. An Army Personnel Records Board made available to the Evaluation Board brief but accurate and complete data on each officer's commissioned service prior to 1 January 1947.

The Army Personnel Records Board developed a form which has become known as the Statement of Service. On it the Board recorded each assignment, rank, rating, promotion, transfer, decoration, and disciplinary action for the period from January, 1937 through December, 1946. Pertinent remarks by superiors were also extracted from the efficiency reports and recorded on the form. Data covering all service prior to 1937 were briefly summarized.

* The opinions expressed are those of the authors and do not necessarily represent official Dept. of the Army policy.

Complete statements of service were sent to 
the Officers Efficiency Evaluation Board. 
This group was charged with the responsibility 
of reviewing the record and arriving at an 
evaluation of the individual officer’s value to 
the Army. The task of reviewing and rating 
some twenty to thirty thousand statements of 
service was indeed formidable. The Personnel 
Research Section of the Adjutant General’s 
Office was called upon for assistance and was 
charged with the development of an objective 
clerical procedure whereby an initial ordering 
of officers could be accomplished. Members of 


Table 1 
Estimated Reliability of Rankings, by Branch 
and by Sample 


(In terms of average intercorrelation among 8 sets 
of 10 ranks each) 


Sample 

Branch A B 
Cavalry .80 84 
Coast Artillery .90 .91 
Engineers 89 83 
Field Artillery .90 74 
Finance .68 .82 
Infantry 1 94 93 
Infantry 2 88 . 74 
Ordnance 88 .69 
Quartermaster -90 83 
Signal Corps 83 81 


the board planned to examine this listing and 
the records and make any necessary adjust- 
ments required by factors not taken into 
account by the first approximation. The 
Officers Efficiency Evaluation Board’s final 
evaluation, called the War Service Score, cor- 
related .86 with the system presented here. 
For purposes of analysis, a stratified sample 
of 200 lieutenant colonels was selected. 


1 The first approximation actually used was not the 
one here presented, but rather an improved System 
consisting of weighted composites of level of responsi- 
bility and efficiency reports. 


^ 


381 


382 


Table 2

Correlation between Records Variables and Average
Branch Rankings, Sample A

Note: All record variables, unless otherwise specified, are for the period from 1 [month illegible] 1937 through 31 Dec. 1946. Those marked by an asterisk are for the war period from 1 July 1943 through 30 June 1945.

                                                            Correlation
     Variable                                               Coefficient
  1. Average efficiency rating                                  .81
 *2. Average efficiency rating                                  .64
  3. Average temporary grade held                               .39
 *4. Average temporary grade held                               .56
  5. Average: echelon of job X grade held                       .59
 *6. Average: echelon of job X grade held                       .62
  7. Per cent of war period spent overseas                      .25
  8. Age when Lt. Col. grade received                          -.31
  9. Present age                                               -.31
 10. Number of times detailed to General Staff                  .33
 11. Total Army training received                               .40
 12. Highest Army training received                         [illegible]
 13. Civilian education                                         .24
 14. Years of enlisted service, not Regular Army                .09
 15. Years of commissioned service, not Regular Army           -.09
 16. Years of enlisted service, Regular Army                    .02
 17. Years of commissioned service, Regular Army                .02
 18. Hospitalized, neurosis                                     .01
 19. Hospitalized, physical                                    -.8[second digit illegible]
 20. Highest grade reached                                      .69
 21. Average temporary grade held (1942-1946)                   .68
 22. Highest efficiency rating                                  .45
 23. Lowest efficiency rating                                   .69
 24. Range of efficiency rating                                -.68
 25. Highest echelon assignment                             [illegible]
 26. Total decorations received                             [illegible]
 27. Highest decoration received                                .67
 28. Total number of assignments (1937-1941)                   -.19
 29. Total number of assignments (1942-1946)                   -.43
 30. Number of months unassigned, non-hospital                 -.20
 31. Number of months unassigned, hospital                  [illegible]
 32. Demotions                                                  .63
 33. [Word illegible] of special aptitudes by superiors         .23
 34. [Entry missing in source]
 35. Weight                                                 [illegible]
 36. Weight-height ratio                                       -.06


Twenty were selected at random from each
of eight branches and 40 from infantry. The
sample from each branch was sub-divided into


Laverne K. Burke and Erwin K. Taylor 


two groups, so that one stratified sample of 100 could be used for developing the evaluation system and the other comparable sample could be used for checking or cross validation.

For each branch subsample of ten cases, statements of service and special ranking [forms] were sent to the respective Career Management Branches. In each of these branches the ten officers in each subsample were independently ranked by eight officers who studied the statements of service carefully. To determine the reliability of these independent rankings, the average intercorrelation among rankers by branch was computed. These are presented in Table 1.

Careful analysis of the statement of service revealed 36 variables capable of quantification. The remarks made by superiors when rating the officers were examined for the number and type of descriptive words and phrases used. However, consideration of the tendency of some superiors to put more descriptive remarks in their rating reports than others, even when rating the same officer, plus the length of time required to tally and check the number of different words used indicated that analysis of the narrative material would not yield variables adequately reliable to justify their inclusion in the study.

The intercorrelations among the 36 variables from the statements of service and average branch rankings were computed for 9[second digit illegible] officers in sample A. Table 2 presents the correlation of each of the 36 variables with the average rank. Table 3 presents the intercorrelations among the 36 variables.³ Applying the Wherry-Doolittle Test-Selection technique,⁴ a battery consisting of variables: (1) average efficiency report; (20) highest grade reached; (23) lowest efficiency report; and (27) highest decoration received, was developed.

The correlation of this battery with average

2 Two officers in this sample had been [text illegible in source] and [text illegible] evaluation of their war service [text illegible].

3 To reduce printing costs and to save space, Table 3 has been deposited with ADI and may be ordered as Document 2749 from American Documentation Institute, 1719 N Street, N.W., Washington, D. C., remitting $0.50 for microfilm (images 1 inch high on standard 35 mm. motion picture film) or [price illegible] for photocopies (6 x 8 inches) readable [text illegible].

4 Stead, Shartle, & Associates. Occupational Counseling Techniques. New York: Am. Book Co. [year and page numbers illegible].


Rating Training and Experience


Table 4

Correlation between Branch Rankings and Composite
of 4 Records Variables, by Branch
and by Sample

Note: The combining weights were derived from the
data of sample A and were used in combining the four
variables for both samples.

                         Sample
Branch                 A       B
Total                 .90     .85
Cavalry               .93     .93
Coast Artillery       .90     .98
Engineers             .97     .92
Field Artillery       .98     .48
Finance               .4[second digit illegible]   .82
Infantry 1            .99     .82
Infantry 2            .94     .98
Ordnance              .93     .82
Quartermaster         .97     .93
Signal Corps          .79     .98


rank by branch is given in Table 4. To make sure that the high correlations in sample A (on which the test selection was based) were not merely a function of sampling error, the battery was also applied to the independent sample B. The correlation of battery scores with average ranking for sample B is also presented in Table 4.

It is interesting to note that there is better agreement between the battery score and the average ranking than there is between pairs of competent evaluators working from the same basic data. This fact argues strongly for the use of a mechanical system of evaluating background items. Such a system not only results in considerable savings of time, money, and personnel but results in an evaluation that is consistently more reliable than the more laborious method of subjective evaluation.
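
The reliability figures in Table 1 are averages of the intercorrelations among the eight rankers of each ten-officer subsample. The following is a minimal sketch of that kind of computation, not the authors' procedure: it assumes Spearman's rank correlation for untied ranks (the article does not name the coefficient used), and the rankings shown are invented for illustration only.

```python
from itertools import combinations


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def average_intercorrelation(rankings):
    """Mean of the correlations over every pair of rankers."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: three rankers, each ranking the same ten officers.
# (The study used eight rankers per subsample; three keep the example short.)
rankers = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [2, 1, 3, 4, 6, 5, 7, 8, 10, 9],
    [1, 3, 2, 5, 4, 6, 8, 7, 9, 10],
]

print(round(average_intercorrelation(rankers), 2))
```

With eight rankers the same function averages 28 pairwise coefficients, which is the sense in which Table 1 speaks of the "average intercorrelation among 8 sets of 10 ranks each."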


Summary 


As a possible means of evaluating Army officers' merit the Personnel Research Section developed a simple mechanical system of evaluation whereby four variables are objectively rated by reference to a set of specified weights. The system was cross-checked against a second group of officers and found to be effective. The evaluations made by the system agreed more closely with the consensus of groups of eight competent evaluators than pairs of evaluators agreed with each other in the subjective rating of the officers' entire records.

It is suggested that the technique employed could be of considerable service to merit systems. We estimate that it would be worthwhile doing when there are in excess of a thousand applicants to be rated. The system has the advantage that it will usually be possible to secure much more competent evaluators to make the trial rankings than could be secured to rate the entire applicant population. It is thus thought that application of the method described will not only save time and effort but will result in better evaluations as well.


Received February 1, 1950.


Measuring the Level of Abstraction 


Rudolf Flesch 
Dobbs Ferry, N. Y. 


Students of communication agree that 
awareness of the level of abstraction is essential 
for full comprehension. For example, Perrin, 
in his college text, says (14): "For exact and
reasonable communication it is highly im- 
portant that a speaker or writer knows where 
in the range of meaning of abstract words his 
core of meaning falls and that he makes this 
clear to his listeners or readers." Seman- 
ticists are especially emphatic on this point. 
Johnson (10) writes: “The prime objective of 
general semantics is to make one conscious of 
abstracting.” Hayakawa (8) writes: “Con- 
sciousness of abstracting is... a sign of 
adulthood." 

Though the significance of the level of abstraction is widely recognized, no studies have been reported that attempt to estimate or measure it quantitatively. Discussions are usually limited to generalized distinctions between abstract and concrete words, illustrative examples, and such figurative devices as the semanticists' "abstraction ladder."

The present study is an attempt to approach this problem with the technique developed for the measurement of readability.

In recent years a number of statistical formulas to estimate readability (comprehension difficulty) have been developed (e.g., 3, 12), among them a four-part formula reported by the writer in 1948 (4, also 5), which was based on the following elements: (1) average sentence length in words; (2) average word length in syllables; (3) average percentage of "personal words"; (4) average percentage of "personal sentences."

To adapt the technique [text missing in source] limiting adjectives, finite verbs, [words illegible] pronouns, and coordinating conjunctions. [Most] of these statistical facts have never been used for the measurement of readability, but [some] of them have been the basis of such [word illegible] studies as Boder's development of the adjective-verb quotient (1), Fries' analysis of the language of the Bible (7), LaBrant's study of subordination in children's writing [citation number illegible], and Séchéhaye's study on the logical structure of the sentence (18). The present study is an attempt to combine and extend these [word illegible] analyses of parts of speech and bring them in line with currently accepted grammatical classifications, as exemplified in the work of Curme (2) and Jespersen (9).

To sum up the hypothesis underlying the present study: Level of abstraction can be estimated by computing the ratio of certain parts of speech to certain other parts of speech in written expression. Since level of abstraction is a basic element in readability (comprehension difficulty), this ratio can be used as a measure of readability, either by itself or in combination with other elements.


Procedure 


The criterion used in the writer's earlier readability formula (4) was McCall-Crabbs' Standard test lessons in reading (13). Since this criterion is keyed to children's grade levels, it seemed also a useful criterion for measuring the level of abstraction. It is generally agreed that children's mental growth corresponds to a growing ability to handle abstractions (see, for instance, Piaget [citation numbers illegible]).

To make the prediction as accurate as possible, the following 42 of the McCall-Crabbs' passages (containing [word illegible], directions, or arithmetical problems) were excluded from the count (the list is only partly legible in the source): Book II, Nos. 4, [...], 89, 92, 93, 94; Book III, Nos. 6, 11, [...], 32, 36, 41, 46, 51, 52, 63, 76, 78, 82; Book IV, Nos. 8, 17, 31, 37, 43, [...], 75; Book V, Nos. 1, 21, 32, 52, 64, 79, 80, 85, [...].

To test the level of abstraction, the writer [words illegible] the generally accepted grammatical
categories of the parts of speech. It was
found that most parts of speech contained certain categories that were statistically related to abstractness and certain others related to concreteness. In general, words related to abstractness are more indefinite, those related to concreteness more definite. The writer, therefore, chose the arbitrary label "definite words" for the words whose percentage was used to measure concreteness (level of abstraction).

The categories of "definite words," arrived at by a process of trial and error, were these (for more detailed definitions, see the section "How to Use the Formula" below):


(1) The following nouns: Common and proper nouns with natural gender; common and proper nouns specifying time; nouns in the possessive case ending in 's or s'; and nouns modified by the limiting adjectives listed below.

(2) The following limiting adjectives: Possessive adjectives; intensifying adjectives; numeral adjectives; the adjectives what, this, [these], that, those, each, same, both; and the if modifying a noun not otherwise modified.

(3) Finite verbs, except the verb to be used as copula.

(4) Present participles ending in ing if used to form the progressive tense.

(5) The following pronouns: Personal pronouns; reflexive pronouns; the relative pronouns who, whose, whom, [text illegible in source] the [limiting adjectives] listed [above when used] as pronouns.

(6) The following adverbs: here, there, now, then, where, when, why, how.

(7) All interjections.

(8) The words yes and no.


The percentage of these "definite words" in the test passages was correlated with the grade of children who answered correctly one-half [text illegible in source] formula [text illegible] expressed in such a way that maximum readability had a value of 100 and minimum readability a value of 0.

Findings 

The correlation between the percentage of "definite words" and the average grade level of children who could answer correctly one-half of the test questions was found to be r = -.5541. The corresponding regression formula is: C50 = 8.7493 - .0852 dw.

On the basis of this evidence, the percentage of "definite words" may be considered useful to form a rough estimate of the level of abstraction, ranging from 0 (fully abstract) to 100 (fully concrete).

The percentage of “definite words” may also 
be combined with average word length in 


Table 1

Correlations, Means, Standard Deviations, and
Regression Weights of Word Length
and "Definite Words"

          dw        C50       Mean        s        Beta
wl      -.5681     .6908    135.1437   14.0030    .5552
dw                -.5541     37.9790    9.1720   -.2387


Table 2

Means and Standard Deviations of Two Criteria

          Mean       s
C50      5.5135    1.4096
C75      7.1302    2.0454

syllables to form a measure of readability. The intercorrelations, means, standard deviations, and regression weights found are shown in Tables 1 and 2. The following symbols were used: wl for word length (syllables per 100 words), dw for percentage of "definite words," C50 for the average grade of children who could answer one-half of the test questions correctly, and C75 for the average grade of children who could answer three-quarters of the test questions correctly.

The regression formula based on these correlations is: R = 168.095 + .532 dw - .811 wl (R represents the readability score).

Scores computed by this formula have a 
range from 0 to 100 for almost all samples 




taken from ordinary prose. A score of 100 
corresponds to the prediction that a child who 
has completed fourth grade will be able to 
answer correctly three-quarters of the test 
questions to be asked about the passage that 
is being rated; in other words, a score of 100 
indicates reading matter that is understandable 
for persons who have completed fourth grade 
and are, in the language of the U. S. Census, 
barely "functionally literate.” The range of 
100 points was arrived at by multiplying the 
grade level prediction by 10, so that a point 
on the formula scale corresponds to one-tenth 
of a grade. However, this relationship holds 
true only up to about seventh grade; beyond 
that, the formula underrates grade level to an 
increasing degree. Finally, the formula— 
which predicted grade level and, therefore, 
difficulty—was "turned around" by reversing 
the signs to predict readability. (Before this transformation the formula read: C75 = .0811 wl - .0532 dw - 1.8095.) The multiple correlation coefficient of this formula is R = .72. Since the correlations with the criterion C50 were higher than those with the criterion C75, the multiple correlation with the criterion C50 was computed first; as a second step, the value so found was used to predict criterion C75, in order to predict 75% comprehension rather than 50% comprehension.
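
The regression weights and formulas above can be checked against the figures printed in Tables 1 and 2. The sketch below uses the standard two-predictor formulas for standardized regression weights; the variable names are illustrative, and the relation R = 150 - 10 x C75 is an inference from the two printed formulas rather than something the article states. Under that reading, the printed coefficients and means give an intercept of about -1.8095 for the grade-level formula, which is what the constant 168.095 in the readability formula requires.

```python
import math

# Figures as printed in Tables 1 and 2 of the article.
r_wl_dw = -0.5681   # word length vs. "definite words"
r_wl_c = 0.6908     # word length vs. criterion C50
r_dw_c = -0.5541    # "definite words" vs. criterion C50
mean_wl, sd_wl = 135.1437, 14.0030
mean_dw, sd_dw = 37.9790, 9.1720
mean_c75, sd_c75 = 7.1302, 2.0454

# Standardized regression (beta) weights for two correlated predictors.
denom = 1 - r_wl_dw ** 2
beta_wl = (r_wl_c - r_dw_c * r_wl_dw) / denom
beta_dw = (r_dw_c - r_wl_c * r_wl_dw) / denom

# Multiple correlation with criterion C50.
multiple_r = math.sqrt(beta_wl * r_wl_c + beta_dw * r_dw_c)

# Raw-score weights rescaled to criterion C75 (the "second step" in the text).
b_wl = round(beta_wl * sd_c75 / sd_wl, 4)
b_dw = round(beta_dw * sd_c75 / sd_dw, 4)
intercept = mean_c75 - b_wl * mean_wl - b_dw * mean_dw

# Inferred link between grade level and readability score: R = 150 - 10 * C75.
r_constant = 150 - 10 * intercept

print(round(beta_wl, 4), round(beta_dw, 4), round(multiple_r, 2))
print(b_wl, b_dw, round(intercept, 4), round(r_constant, 3))
```

The rounded beta weights and the multiple correlation agree with the printed values (.5552, -.2387 and .72), and the rescaled weights reproduce the coefficients .0811 and -.0532.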


Comment 


The percentage of “definite words” appears 
to be a useful test in two ways: 

First, it is a rough measure of the level of 
abstraction. As such, it may be used as a tool 
in semantic studies, critical reading, literary 
appreciation, translation, rating of advertising 
copy, and propaganda analysis. Step-by-step 
analysis of the level of abstraction in a given 
piece of discourse may also be helpful in logical 
analysis and the discovery of faults in reasoning.

Second, the new test is a measure of 
readability. As such it replaces the two tests 
of "human interest" (percentage of "personal words" and "personal sentences") that formed part B of the writer's earlier formula (5). Combined with the average word length in syllables, it gives a practical and comprehensive measure of readability.

It should be noted that this measure approaches the problem of readability in [a positive] way, since it emphasizes the importance of the newspaperman's "Five W's" (Who? What? Where? When? Why?), the role of names and addresses, facts and figures, and the [use of] practical examples, illustrations, and [word illegible]. Negatively, it points up the bad [effect of too] many adjectives, abstract nouns, and [word illegible] verb forms. It may, therefore, be expected to become a useful training device in the teaching of composition.

According to Shannon's mathematical theory of communication (19, 20), any noise during the transmission of a message is counteracted by redundancy in the message itself. By extension, any "semantic noise" (comprehension difficulty, ambiguity, etc.) is counteracted by "semantic redundancy" (repetition, padding, amplification, restatement, etc.). In other words, information on a high level of abstraction, which is difficult to grasp, is communicated more effectively to the extent that restatements on a more concrete level are added. This, of course, corresponds to the use of illustrations, practical applications, examples, parables, and the like.

The new measure of abstraction may possibly be used in conjunction with the mathematical theory of communication to estimate the number and level of concrete illustrations necessary to convey a given piece of abstract information to a given audience.

This general principle has long been recognized in practice and there is a marked tendency in modern writing to alternate between abstract generalizations and concrete examples. Because of that, the percentage of "definite words" often fluctuates between extremes within a single piece of writing. This may make more or longer samples necessary in order to arrive at a reliable estimate.

The correlation of the new factor to comprehension (.55) is higher than that (.52) of average sentence length to comprehension reported in the writer's earlier paper (5). In other words, concreteness appears somewhat more important for readability than the use of short sentences.
An amusing illustration is furnished by Hayakawa (8), who [illustrates] his "abstraction ladder" by way of four statements on different levels of abstraction. Application of the new measure to these examples shows the following:

                                           Ratio of       Percentage of
                                           "definite      "definite
                                           words"         words"
Mrs. Plotz makes good potato pancakes.     3 in 6         50
Mrs. Plotz is a good cook.                 2 in 6         33
[Word illegible] women are good cooks.     1 in 5         20
The culinary art has reached a
  high state in America.                   1 in 10        10


How to Use the Formula 


For practical application, the directions for 
using the formula may be stated as follows: 

To measure the level of abstraction of a
given piece of writing, go through the following
steps:


Step 1. Unless you want to test a whole piece of writing, take samples. Take enough samples to make a fair test. Don't try to pick "good" or "typical" samples. Go by a strictly numerical scheme. For instance, take every third paragraph or every other page. Each sample should start at the beginning of a paragraph.

Step 2. Count the words in your piece of writing or, if you are using samples, take each sample and count each word in it up to 100. Count contractions and hyphenated words as one word. Count as words numbers or letters separated by space.

Step 3. Count the following "definite words." Count each of these words only once. Count as words all units separated by white space. In the examples, "definite words" are italicized.

(1) Count all names of people, that is, proper nouns with natural gender, either masculine or feminine. Count all [parts of the name, including] titles, etc., e.g., [example illegible in source], President Harry S. Truman (count 4).

Count names of people used as adjectives to modify other natural-gender nouns (e.g., the Smith brothers, the Dolly sisters), but do not count proper names used as adjectives to modify nouns without natural gender (e.g., the Ford Motor Company, the Washington Monument).

(2) Count all common nouns that have natural gender, either masculine or feminine. Examples: father, mother, iceman, actress. Count [these words] also when used as adjectives to [indicate] the gender of another noun (e.g., [word illegible] doctor, bull elephant, girl athlete), but do


not count them when they do not indicate 
gender (e.g., fellow workers). 

Do not count common-gender nouns like 
teacher, doctor, employee, spouse. 

(3) Count all nouns that indicate a specific 
time on the clock or the calendar; e.g., the 
names of the months, the seasons, the days of 
the week, the words morning, noon, afternoon, 
evening, night, day (in the sense of daytime), 
sunup, sundown, today, tonight, yesterday, to- 
morrow, and the words breakfast, lunch, dinner, 
supper when used to indicate time. Count 
these words also when used as adjectives to 
specify time (e.g., December day, fall season, 
lunch hour), but do not count them when used 
as adjectives without relation to time (e.g., 
Thursday club, dinner companion, Sunday 
suit). 

(4) Count all numeral adjectives and all 
nouns modified by numeral adjectives. Count 
also the words first, next, last, and such words as 
double, pair, half, triple. Examples: 27 words. 
54 percent. Six letters. The next day. The 
last moment. But don't count nouns that are 
not directly modified by numerals, e.g., thou- 
sands of people, a ten-year-old house. 

Count the word one in the sense of "single," but not when used as an indefinite pronoun or as part of "no one," "any one," or "some one." Examples: Count one in "one fine morning" and in "the pretty one," but not in "one has one's doubts."

Count the word once in the sense of "a single time," but not in the phrase "at once." Examples: Once upon a time. He succeeded once.

Count the words other and another only in the sense of "second" or "one more."

(5) Count all finite verb forms—that is, 
verbs in the first, second, or third person and 
the present, past, or future tense. In verb 
forms with auxiliary verbs, count the auxiliary 
rather than the main verb. Examples: He 
came and went. We have considered. You 
should have declined. Let us pack up and 
go. It was debated and voted down. 

Exception: Do not count the verb "to be" when used as copula, that is, as a link between subject and predicate. Do not count "to be" as copula in any form, with or without an auxiliary verb. Examples: I was sick. You should have been there. It might be fun. He ought to have been careful.

(6) Count all present participles (-ing) when used as part of the progressive tense (to be -ing). Examples: He was running. I am going to look. You should have been working.

(7) Count all personal pronouns except indefinite "it," and all reflexive pronouns, formed with -self or -selves. Examples: It is a fact that you and I are not related. It was Mary herself.

(8) Count the words here, there, then, and now (except indefinite "there" in "there is," "there are," etc.).

(9) Count the words who, whom, when, where, [why,] and how.

(10) Count the words this, that, these, and those; each, same, and both; and nouns modified by them.

(11) Count the words what and which (inter- 
rogative) and nouns modified by them, but not 
the word "which" when used as a relative 
pronoun. Examples: Which way are you going? 
But: The car which I bought. 

(12) Count all possessive pronouns (my, your, his, her, its, our, their, etc.), all nouns in the possessive case ending in 's or s', the word whose, and all nouns modified by these possessives. Examples: Our modern civilization, its recent development, whose business, journey's end.

But do not count possessive cases of pro- 
nouns not otherwise counted, e.g., ‘one’s ideas,” 
"someone else's hat." 

(13) Count the word that when used as a relative pronoun, but not when used as a conjunction. Examples: Remember that you are sick. It was the humidity that did it.

(14) Count the words yes and no (used as 
answer). 

(15) Count all interjections. 

(16) Count the definite article the and the noun modified by it, but only if that noun is a single word not otherwise modified. Examples: I missed the bus. We called the doctor. It was the truth. But: I missed the green bus. We called the eye specialist. It was the truth, plain and simple. They gave me the room I had last time. It was the beginning of the end. We live at the end of the village, in the yellow house at the turn of the road.

Do not count the word "the" when modifying a noun that is to be counted under one of the other definitions. Examples: He came in the afternoon. The only one there was the boy.

Do not count the word "the" when modifying adjectives or noun-adjectives, particularly proper noun-adjectives referring to nationality, race, etc. Examples: Their team was the best. We sat in the dark. What's the good of it? The Scotch are thrifty. The Negro was arrested.

Step 4. The number of "definite words" in your 100-word sample (or the average percentage of "definite words" in all your samples or the whole piece of writing tested) indicates the level of abstraction. The typical relation between the percentage of "definite words" and level of abstraction is shown in Table 3.


Table 3

Typical Percentages of "Definite Words"

Level of               Percentage of
Abstraction            "Definite Words"
Highly abstract        Up to 20
Fairly abstract        20 to 30
Fairly concrete        30 to 45
Highly concrete        Over 45


To test the readability of a piece of writing, go through the following additional steps:

Step 5. Count the syllables in your 100-word samples or, if you are testing a whole piece of writing, compute the number of syllables per 100 words. If in doubt about syllabication rules, use any good dictionary. Count the number of syllables in symbols and figures according to the way they are normally read aloud, e.g., two for $ ("dollars") and four for 1918 ("nineteen-eighteen"). If a passage contains several or lengthy figures, your estimate will be more accurate if you don't include these figures in your syllable count. In a 100-word sample, be sure to add instead a corresponding number of words in your syllable count. To save time, count all syllables except the first in all words of more than one syllable and add the total to the number of words tested. It is also helpful to "read silently aloud" while counting.
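
The time-saving rule in Step 5 gives the same total as a direct count, because every word contributes one syllable plus any extras. A small sketch of that equivalence follows; the function name and the per-word syllable counts are made up for illustration, and every word is assumed to have at least one syllable.

```python
def syllable_totals(syllables_per_word):
    """Return (direct total, shortcut total) for a list of per-word syllable counts."""
    direct = sum(syllables_per_word)
    # Shortcut: count only the syllables beyond the first in words of more
    # than one syllable, then add the number of words tested.
    extras = sum(count - 1 for count in syllables_per_word if count > 1)
    shortcut = extras + len(syllables_per_word)
    return direct, shortcut


# Hypothetical counts for the six words "The culinary art has reached a".
counts = [1, 4, 1, 1, 1, 1]
direct, shortcut = syllable_totals(counts)
print(direct, shortcut)
```

For a 100-word sample the shortcut total is already the number of syllables per 100 words (wl) used in the formulas.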

Step 6. Find your readability score by inserting the number of "definite words" per hundred words (dw) and the number of syllables per hundred words (word length, wl) in the following formula:

R = 168.095 + .532 dw - .811 wl

The readability score will put your piece of writing on a scale between 0 (practically unreadable) and 100 (easy for any literate person). Readability scores will tend to follow the pattern shown in Table 4.


Table 4

Pattern of Readability Scores

Readability        Description
Score              of Style
0 to 30            Very difficult
30 to 50           Difficult
50 to 60           Fairly difficult
60 to 70           Standard
70 to 80           Fairly easy
80 to 90           Easy
90 to 100          Very easy

[The remaining columns of this table, including syllables per 100 words, are illegible in the source.]

Or you may want to test a piece of writing separately for concreteness and ["reading ease."] If so, do this after Step 5:



Alternative Step 6: Figure the average sentence length in words for your piece of writing or, if you are using samples, for all your samples combined. In a 100-word sample, find the sentence that ends nearest to the 100-word mark (that might be at the 94th word or the 109th word). Count the sentences up to that point and divide the number of words in those sentences by the number of sentences. In counting sentences, follow the units of thought rather than the punctuation: usually sentences are marked off by periods; but sometimes they are marked off by colons or semicolons, like these. But don't break up sentences that are joined by conjunctions like and or but.

Alternative Step 7: Find your "reading ease" score by inserting the number of syllables per 100 words (word length, wl) and the average sentence length (sl) in the following formula:

RE ("reading ease") = 206.835 - .846 wl - 1.015 sl

The "reading ease" score will put your piece of writing on a scale between 0 (practically unreadable) and 100 (easy for any literate person).
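
The two alternative steps can be sketched in a few lines of code. This is an illustration under stated assumptions, not part of the published procedure: the function names are invented, the sentence boundaries are taken as already decided by the reader (the "units of thought" rule is a matter of judgment), and the word counts in the example are made up.

```python
def sample_sentence_length(sentence_word_counts, target=100):
    """Average sentence length up to the sentence end nearest the target word mark."""
    if not sentence_word_counts:
        raise ValueError("at least one sentence is required")
    best_sentences, best_words, best_gap = 0, 0, None
    running_total = 0
    for number, words in enumerate(sentence_word_counts, start=1):
        running_total += words
        gap = abs(running_total - target)
        # Keep the first sentence end that comes closest to the target mark.
        if best_gap is None or gap < best_gap:
            best_sentences, best_words, best_gap = number, running_total, gap
    return best_words / best_sentences


def reading_ease(wl, sl):
    """RE = 206.835 - .846 wl - 1.015 sl, as given in Alternative Step 7."""
    return 206.835 - 0.846 * wl - 1.015 * sl


# Hypothetical sample: sentences end at the 20th, 45th, 75th, 94th and 109th word.
counts = [20, 25, 30, 19, 15]
sl = sample_sentence_length(counts)   # the 94th word is nearest to 100
print(sl, round(reading_ease(140, sl), 1))
```

Here the sample stops at the 94th word, so four sentences are counted and the average sentence length is 94 / 4 = 23.5 words.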


Sample Application 


To show the application of the new test, three passages dealing with the subject of truth will be used. Passage A, exemplifying a highly concrete style, is Test Lesson No. 37 from Book III of the Standard Test Lessons in Reading (13) that were used as criterion. Passage B was taken from Courts on Trial by Jerome Frank (6). It is an example of writing on the middle level of abstraction, illustrating the more abstract generalizations in the first paragraph by more concrete examples in the second. Passage C was taken from The Next Development in Man by L. L. Whyte (21) and shows a highly abstract philosophical [word illegible] of truth.

"Definite words" in the three passages are
italicized.


Passage A: 


The children were telling about their Christmas 
vacations. 

“We went to Kansas," said Jack. “One day 
when we were skating on the lake some of the 
boys cut a hole in the ice, struck a match and a 
fire blazed right up out of the hole for two or 
three minutes." 

“Oh, oh!" said all the others, “that couldn't 
be true. Water doesn't burn." 

“But it is true," said Jack. “I saw it."

They turned to the teacher to see what she 
would say and she explained this very strange 
happening. It seems there are natural gas 
wells under the lake which send the gas bubbling 
up through the water where it is caught in large 
pockets under the ice. 

“So you see," said the teacher, “when a hole is 
cut the escaping gas will burn if lighted." 


Passage B: 


No means then, have as yet been discovered 
or are likely to be discovered, for ascertaining 
whether or to what extent the belief of the trial 
judge about the facts of a case corresponds to 
the objective facts as they actually occurred, 
when the witnesses disagree, and when some of 
the oral testimony, taken as true, will support 
the judge's conclusion. In other words, in such 
a case there is no objective measure of the 
accuracy of a judge's finding of the facts. There 
exists no yardstick for that purpose. 

In a "contested" law suit, therefore, with 
the witnesses in disagreement, usually no one 
can adequately criticize the trial judge's fact- 
finding. If, at the end of the trial, the trial 
judge says that Jones hit Smith, or that Mrs. 
Moriarity called Mrs. Flannagan a liar, or that 
old widow Robinson was insane when she made 
her will, or that Wriggle used fraud in inducing 
Simple to sign a contract—the judge's word 
goes. And the same would be true if, in most 
of those instances, the trial judge had found
exactly the opposite to be the facts.
Table 5

Analysis of Three Sample Passages

                       Percentage of             Syllables per    Readability
                       "Definite Words"          100 Words        Score
Passage A              53 (highly concrete)      121              98 (very easy)
Passage B:
  First paragraph      24 (fairly abstract)      [illegible]      [illegible] (fairly difficult)
  Second paragraph     37 (fairly concrete)      [illegible]      [illegible]
  Both paragraphs      31 (fairly concrete)      151              62 (standard)
Passage C              17 (highly abstract)      159              48 (difficult)


Passage C:

Truth is thought which conforms to the form 
of the whole. Conformity to the whole is the 
criterion. The unitary truth is that which 
conforms to the whole process of which it is 
a part. The truth is a form embedded in 
the whole complex of processes in the human 
organism and its environment, symbolizing and 
organizing them. A particular truth may not 
represent the entire structure of a situation but 
only those aspects which are relevant to thought 
at a given stage in its development. The truth 
is a system of symbols whose structure conforms 
to the whole pattern of feeling, thought, and 
action, and integrates all the processes which 
link the reception of stimuli and the molding 
of the ultimate responses. This is not a prag- 
matic criterion, since unitary truth does not 
merely serve special needs but unites the whole 
system in a conviction which is at once emo- 
tional, intellectual, and practical. 


A comparative analysis of the three passages 
is given in Table 5. 

Considerable experience with practical ap- 
plication of the new test has shown that an 
untrained person can acquire reasonable famili- 
arity with this test after ten to twenty applica- 
tions and will then be able to count the “defi- 
nite words” in a 100-word sample in one to two 
minutes. The application of the rest of the 
readability test will ordinarily take another 
one to two minutes. 


Received September 26, 1950. 
Early publication. 




Personality Characteristics of Embalmer Trainees 


Daniel N. Wiener and Werner Simon, M.D. 


Veterans Administration, Fort Snelling, Minnesota * 


Occasionally clients are counseled who have a strong desire to be embalmers or undertakers. This vocational objective is seldom suggested to a client unless he shows high original motivation to enter this field.

In a regional veterans' counseling program thirty-six veterans indicated their desire to become embalmers and began vocational training. The vocation is recognized through standards set by the State of Minnesota, and the organized training program includes two years of college work prior to specialized training.

Psychiatrists and psychologists, in discussing the occupation of embalming, frequently raise the question as to personality traits of those attracted to this field. In an attempt to investigate this question a study has been made of the personality characteristics of veterans who indicated their interest in this work, in order to determine whether they are obviously unique in personality. This is but one of many occupations where the technical skill and social attitudes involved may be less important factors to the counselee than his personal peculiarities in motivation and needs.


Literature 

In a review of the recent literature of psychiatry and psychology for possible prior research in this area, no study of any kind has been found, although there are a number of analyses on concepts of death, as well as several case reports on necrophilia. The [...] has only secondary [...] sufficient to [...] raise [...] of responsiveness to [...] funds for medical research ([...] paralysis, tuberculosis), and in rituals of funerals. In relation to the present problem this literature on concepts of death suggests an investigation of the personality structure of those who seem to lack this common fear as shown by their work with the dead (2, 3).

* Published with permission of the Dean's Committee, Medical School, University of Minnesota.

Case reports on necrophilia stress patho- 
logical personality organization, especially in 
the sexual sphere. Brill (1) reports the case 
of a thirty-year-old single male who was 
primarily a passive homosexual and whose 
necrophilia represented a by-product of his 
pathological sexuality. He gave a history of a 
severe phobia with regard to corpses since 
early childhood, which was apparently over- 
come by touching a corpse. This experience 
changed the former phobia into the opposite, 
a strong attraction for corpses. As his fascina- 
tion for corpses increased, he became the paid 
assistant of an undertaker, who spoke highly 
of his diligence and skill. During analysis he 
confessed that it was his morbid desire to play 
with corpses which prompted him to work for 
the undertaker. 

Rapoport's (4) patient was a fifty-year-old man who was arrested because he was found kissing and touching a female corpse in a funeral parlor. He began his practice of visiting dead bodies six years after the death of his mother, and a ritual of visiting funeral parlors two or three times a week ensued. The corpses were usually female, middle-aged or elderly. Rapoport points out that the literature contains very little on the subject of necrophilia, but it seems to him unlikely that the number of cases is as small as the absence of literature suggests.


Method


It seemed desirable to determine common 
impressions of psychologists and psychiatrists 
as to what personality peculiarities, if any, em- 
balmers might have. This was done by asking 
both vocational and personal counselors, as 
well as psychiatrists, for their impressions, and 
tabulating their responses.


The actual personality characteristics of embalmer trainees were measured by use of the Minnesota Multiphasic Personality Inventory, with the recently developed subtle and obvious keys (5) included. A group of thirty-six veterans were studied, all of whom indicated a desire to be embalmers and made this their training objective after counseling. The control group consisted of one hundred veterans who came to the same guidance centers for counseling during the same period of time, but who decided upon other vocational objectives. All were tested with the MMPI.

The use of the term “embalmer” may be 
distinguished from that of “undertaker.” The 
undertaker receives training first in embalming, 
followed by additional training involving more 
of the business and social details of funeral 
arrangements. 


Results 


Thirteen vocational and personal counselors were asked to write briefly their impressions of the personality characteristics of men interested in embalming. Thirty-six responses were elicited, with a range of one to five reactions from each counselor. Table 1 indicates the results of the tabulation of the responses. Relatively shallow emotionality, as well as aggressiveness, are the outstanding impressions, with feminine characteristics also suggested.

Table 1
Personality Characteristics of Embalmers According to
Thirteen Personal and Vocational Counselors

Traits                 Frequency
Psychopathic              11
High Drive                 9
Feminine                   5
None, Stable               5
Introvertive               3
Scientific Interest        2
Depressive                 1
                          --
                          36

Five psychiatrists gave ten responses to a similar question, with a range of one to three items. On tabulation, responses classified as "extrovertive," "compulsive," "passive-aggressive" and "none" were given twice. There were single responses for "depressive" and "psychopathic."

Table 2
Means and Standard Deviations of Embalmer (N = 36) and Control (N = 100)
Groups in Age and Education, and on MMPI

                                Mean        S.D.
Age                   Emb.     25.72      [illeg.]
                      Con.     24.48        1.55
Education             Emb.     12.00        1.63
                      Con.     11.68        1.85
K                     Emb.     56.45        9.60
                      Con.     55.15        4.90
F                     Emb.     50.20        5.80
                      Con.     50.95        8.55
Hypochondriasis*      Emb.     54.25      [illeg.]
                      Con.     51.00       10.05
Depression**          Emb.     49.65       11.40
                      Con.     52.35       11.75
  D-Subtle            Emb.     48.70      [illeg.]
  D-Obvious           Emb.     51.30      [illeg.]
Hysteria              Emb.     55.90        8.55
                      Con.     55.35      [illeg.]
  Hy-Subtle           Emb.     57.70      [illeg.]
  Hy-Obvious          Emb.     52.05        3.40
Psychopathic Dev.     Emb.    [illeg.]     11.80
                      Con.    [illeg.]      8.95
  Pd-Subtle           Emb.     55.50      [illeg.]
  Pd-Obvious          Emb.     50.35      [illeg.]
Femininity**          Emb.     54.80        7.45
                      Con.     56.90        4.50
Paranoia*             Emb.     47.85        8.55
                      Con.     50.75       10.90
  Pa-Subtle           Emb.     51.75        6.45
  Pa-Obvious          Emb.     44.00        6.15
Psychasthenia         Emb.     55.10       10.10
                      Con.     54.44        3.00
Schizophrenia         Emb.     56.35        9.45
                      Con.     56.14       11.90
Hypomania**           Emb.     59.55      [illeg.]
                      Con.     56.80       11.70
  Ma-Subtle           Emb.     61.30        9.80
  Ma-Obvious          Emb.     51.75      [illeg.]

* Difference between means is significant at the [illeg.] level.
** Difference between means is significant at the [illeg.] level.

Table 2 gives the means and standard deviations for the embalmer trainees and control groups, together with an indication of the significance of the difference between the means of each MMPI Scale.
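
The kind of comparison summarized in Table 2 can be sketched as a critical ratio: the difference between two group means divided by the standard error of that difference. The article does not state which significance test was used, so the unpooled standard error below is an assumption, as are the function and variable names. The Paranoia figures are taken from Table 2 as printed.

```python
import math

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error.

    Assumption: unpooled standard error, sqrt(sd_a^2/n_a + sd_b^2/n_b).
    The article does not say which formula the authors used.
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff

# Paranoia scale: embalmer trainees (N = 36) versus controls (N = 100).
cr_paranoia = critical_ratio(47.85, 8.55, 36, 50.75, 10.90, 100)
print(f"Paranoia critical ratio: {cr_paranoia:.2f}")
```

A negative ratio here simply reflects that the embalmer mean is the lower of the two.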

In age and education the embalmer and control groups were not significantly different from each other. In Hypochondriasis, Depression, Femininity, Paranoia and Hypomania, however, the means do differ slightly from each other with some likelihood of significance. Most prominent is the slightly greater tendency toward Hypochondriasis in the embalmers. Hypomania is somewhat less elevated, while in Depression, Femininity and Paranoia the embalmers are slightly lower than the control group. Most of the subtle scores are higher than the obvious scores, a characteristic of successful groups in previous studies (6).


Summary 


These test results indicate that there are no strongly deviate personality characteristics among the embalmer trainees. Elevations on the Subtle scores, along with the Hypomania score, seem to characterize successful groups in other areas (6). The fact that the scores for several other scales are somewhat lower for embalmers than for the control group suggests the same conclusion.

The strongest possibility for an interpreta- 
tion of uniqueness lies in the elevated Hypo- 
chondriasis. Tendencies toward over-concern 
with personal health may, through projection, 
lead to an interest in the dead and in preparing 
bodies for burial. 

Negatively, the results of this study do not 
substantiate tendencies toward shallow emo- 
tionality, femininity, compulsiveness or necro- 
philia, which a strong interest in dead bodies 
might suggest and which both counselors and 
psychiatrists tend to indicate. 


Received January 23, 1950. 


References 


1. Brill, A. A. Necrophilia. J. crim. Psychopath., 
1941, 2, 433-443. 

2. Caprio, F. S. A psycho-social study of primitive 
conceptions of death. J. crim. Psychopath., 
1943, 5, 303-317. 

3. Orlansky, H. Reactions to the death of Pres. 
Roosevelt. J. soc. Psychol., 1947, 26, 235-266. 

4. Rapoport, J. A case of necrophilia. J. crim. Psychopath.,
1942, 4, 277-289.

5. Wiener, D. N. Subtle and obvious keys for the
Minnesota Multiphasic Personality Inventory.
J. consult. Psychol., 1948, 12, 164-170.

6. Wiener, D. N. A control factor in social adjustment.
J. abnorm. soc. Psychol. (In press.)


The Guilford-Zimmerman Temperament Survey and Certain 
Related Personality Tests *


Claudia Gilbert 
The Pennsylvania State College 


As a result of an accumulation of evidence 
obtained by testing supervisors in industrial 
plants in Pennsylvania, the Personnel Service 
Division of The Pennsylvania State College is 
now using the Bernreuter Personality Inventory
and the Guilford-Martin Personnel Inventory 
in test batteries for the selection of candidates 
for supervisory positions. Consequently, the 
publication of a new personality test, the 
Guilford-Zimmerman Temperament Survey, 
which appears to incorporate in one test the 
traits now being measured by a combination of 
the Guilford-Martin and the Bernreuter, is of 
great interest to the Division. 

The purpose of the present investigation was 
to set up an experimental situation which 
allowed a comparison of the Guilford-Zimmer- 
man, Guilford-Martin, and Bernreuter, and an 
analysis of the resulting intercorrelations, in 
order that the Personnel Service Division might 


have a starting point for further experimenta- 
tion with this new test. 


The Tests 


Since the purpose of this study was to make a practical comparison of the Bernreuter with the Guilford-Zimmerman, it was decided to use the three most widely applied Bernreuter scales (B1-N-Neurotic Tendency, reversed and oriented as Stability; B2-S-Self-Sufficiency; and B4-D-Dominance), these also being the scales used by the Personnel Service Division in selection for supervisory trainees.

The Guilford series of personality tests are the result of an extensive series of factor analyses that have identified a number of personality factors. Of the original Guilford series of three personality scales, the Guilford-Martin Personnel Inventory measuring Objectivity, Agreeableness, and Cooperativeness has been found to offer good possibilities in industrial situations. The authors have published studies which give an indication of its validity, particularly in regard to the selection of supervisors (5).

The new Guilford-Zimmerman Temperament Survey measures ten Guilford factors, termed GRASE-OFTPM. These ten factors, measured with 300 items by the Guilford-Zimmerman, have been arrived at by means of condensations and omissions of trait [...] where intercorrelations are sufficiently high, thus purporting coverage in a more [...] fashion of the 13 factors measured by the three original Guilford inventories. Rather than being composed of questions, the new [...] is made up of declarative statements, so that instead of "Do you get things done in a hurry?" it is, "You get things done in a hurry." No validation studies of the Guilford-Zimmerman Temperament Survey are known to the author.

* This material is derived from the author's thesis submitted in June, 1950, in partial satisfaction of the requirements for the M.S. degree at The Pennsylvania State College. The author is deeply indebted to Charles Griffin and Dr. Kinsley R. Smith for their guidance and criticism.


Procedure 


The subjects for the study were the members of two introductory psychology classes at The Pennsylvania State College. Of these, an all male population of 64 was used. Scatter diagrams were prepared in order to ascertain the linearity of the data. The Pearson product moment formula was employed and, in all, 120 correlations were determined.
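
As a rough illustration of the computation named above, the product-moment coefficient can be written out directly. The function name and the toy scores are hypothetical. The count of 120 is consistent with correlating every pair among 16 scales (ten Guilford-Zimmerman factors, three Guilford-Martin factors, and three Bernreuter scales), although the article does not spell out that breakdown, so it is treated here as an assumption.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy scores, not data from the study.
toy_r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# Assumed scale count: 10 Guilford-Zimmerman + 3 Guilford-Martin + 3 Bernreuter.
n_scales = 10 + 3 + 3
n_pairs = len(list(combinations(range(n_scales), 2)))
print(round(toy_r, 3), n_pairs)
```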


Results 


Intercorrelations of the Traits. In Table 1 are presented the coefficients of correlation, in terms of product moment r's, between the Guilford-Zimmerman factors and their counterparts in the Guilford-Martin. O-Objectivity in the Guilford-Zimmerman correlated .78 with O-Objectivity in the Guilford-Martin, F-Friendliness in the Guilford-Zimmerman correlated .76 with Ag-Agreeableness in the Guilford-Martin, and P-Personal Relations in the Guilford-Zimmerman correlated .82 with Co-Cooperativeness in the Guilford-Martin. Also in Table 1 are presented the coefficients between the Guilford-Zimmerman factors, E-Emotional Stability and A-Ascendance, and the traits B1-N (reversed) Stability and B4-D-Dominance in the Bernreuter. E-Emotional Stability in the Guilford-Zimmerman correlated .80 with B1-N (reversed) Stability in the Bernreuter, and A-Ascendance in the Guilford-Zimmerman correlated .80 with B4-D-Dominance in the Bernreuter.

Table 1
Coefficients of Correlation *

                                      Guilford-Zimmerman
                                O      F      P      E      A
Guilford-Martin
  O-Objectivity                .78
  Ag-Agreeableness                    .76
  Co-Cooperativeness                         .82
Bernreuter
  B1-N (rev.) Stability                             .80
  B4-D-Dominance                                           .80

O = Objectivity; F = Friendliness; P = Personal Relations; E = Emotional Stability; A = Ascendance.
* All r's significant at the .01 level.
Further Analysis of Intercorrelations. In extending the comparison between the factors in the Guilford-Martin and their counterparts in the Guilford-Zimmerman to include comparisons with the remaining seven factors in the Guilford-Zimmerman as well as with the Bernreuter traits, several correlations fell into the category of r's generally accepted as denoting a high to very high relationship. A correlation of .74 was found between O-Objectivity in the Guilford-Martin and B1-N (reversed) Stability in the Bernreuter. Also, there was a correlation of .74 between this factor, O-Objectivity in the Guilford-Martin, and E-Emotional Stability in the Guilford-Zimmerman.

In comparing the Bernreuter traits with the Guilford factors, the only correlation between these measures which could be accepted as high, aside from those already discussed, was the .74 coefficient between B1-N (reversed) Stability in the Bernreuter and O-Objectivity in the Guilford-Martin.

The intercorrelations found in this study when comparing the Guilford-Zimmerman traits to each other, in terms of product moment r's, indicated similar interrelationships to those indicated by the tetrachoric coefficients published in the Guilford-Zimmerman manual. There was one very high coefficient, however, which fell well above our arbitrary standard for a high correlation. This was a .84 between E-Emotional Stability and O-Objectivity.

The high intercorrelation between B1-N (reversed) Stability and O-Objectivity in both the Guilford-Martin and the Guilford-Zimmerman brings to mind the high intercorrelation of -.95, as reported in the Bernreuter Manual, between B3-I-Introversion and B1-N-Neurotic Tendency. This suggests, as do the data presented in this paper in terms of correlation coefficients, that on the one hand B1-N, oriented as Stability, and B3-I, oriented as Extraversion, measure the same thing, and on the other hand, that this is the same trait as the Guilford-Zimmerman and the Guilford-Martin factor O-Objectivity. The high correlation of .84 between O-Objectivity and E-Emotional Stability in the Guilford-Zimmerman again suggests that these are not separate traits.

To sum up, the evidence presented suggests that the various personality inventories which are measuring traits named Objectivity (or Extraversion) and Emotional Stability are primarily measuring only one factor.

A Comparison of B2-S-Self-Sufficiency with the Guilford Factors. There was no indication of a relationship between B2-S-Self-Sufficiency and any of the Guilford factors. The highest correlation was, in fact, .37, which though significant at the .01 level of confidence, appeared to be more noteworthy for its lowness than for its highness. However, such a result seems understandable, since up to this point the trait Self-Sufficiency has not been named by Cattell, Guilford, or Thurstone, and apparently has not been isolated in factor analysis. Also, it remains the most independent measure, in terms of intercorrelations, in the Bernreuter Personality Inventory.
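
The statement that r = .37 is significant at the .01 level with 64 subjects can be checked with the usual t transformation of a correlation coefficient. This is a sketch under the assumption that a two-tailed t test with N - 2 degrees of freedom is appropriate; the article does not say how significance was determined. The critical value 2.66 is the tabled two-tailed .01 value for 60 degrees of freedom, used here as an approximation for 62, and the names are hypothetical.

```python
import math

def t_for_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Tabled two-tailed .01 value for df = 60 (assumed close enough for df = 62).
T_CRIT_01_DF60 = 2.66

t_value = t_for_r(0.37, 64)
significant = t_value > T_CRIT_01_DF60
print(f"t = {t_value:.2f}, significant at .01: {significant}")
```

The same function shows why a coefficient can be statistically significant and still, as the author remarks, be notable mainly for how low it is: with 64 cases, significance says little about the strength of the relationship.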

On the other hand, there could be another explanation. An inspection of the scatter diagrams of B2-S with the Guilford-Zimmerman factors suggests that in regard to several factors, especially A-Ascendance, there may be curvilinearity rather than linearity of relationship. In terms of the present research, this means the relationship between the Guilford factors and B2-S may have been masked when studied in terms of the Pearson r, and significantly higher correlations might have appeared if the eta correlation were applied. To check this assumption the eta between B2-S-Self-Sufficiency and A-Ascendance was computed. The chi-square test for linearity of regression indicated, however, that the departure of curvilinear relationship from linear relationship was not significant at the .05 level.
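
The correlation ratio referred to above can be sketched as follows: scores on one measure are grouped by class interval of the other, and eta is the square root of the between-class sum of squares divided by the total sum of squares. The linearity check shown is the common F-ratio form, swapped in for illustration and not necessarily the chi-square form the author used. All names and the toy data are hypothetical.

```python
import math
from statistics import mean

def correlation_ratio(groups):
    """Eta of y on x; `groups` holds the y scores falling in each x class interval."""
    all_y = [y for g in groups for y in g]
    grand_mean = mean(all_y)
    ss_total = sum((y - grand_mean) ** 2 for y in all_y)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return math.sqrt(ss_between / ss_total)

def linearity_f(eta, r, n, k):
    """F ratio for departure from linearity (k classes, n cases).

    Assumption: the F form stands in for the author's chi-square test.
    Degrees of freedom are (k - 2, n - k).
    """
    return ((eta ** 2 - r ** 2) / (k - 2)) / ((1 - eta ** 2) / (n - k))

# Hypothetical toy data: three x classes whose y means rise in a straight line.
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
toy_eta = correlation_ratio(toy_groups)
```

When the class means fall on a straight line, eta equals the absolute value of r and the F ratio is zero; eta exceeds r only to the extent that the class means depart from that line.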


Summary 


A study was made of the relationships be- 
tween the factors measured by the Guilford- 
Zimmerman Temperament Survey and the 
traits measured by two previously validated 
inventories, the Guilford-Martin Personnel Inventory
and the Bernreuter Personality Inventory.
The subjects for the study were
members of two introductory psychology 
classes at The Pennsylvania State College. Of 
these, an all male population of 64 was used. 
Scatter diagrams were prepared in order to 
ascertain the linearity of the data. The Pear- 
son product moment formula was employed 
and, in all, 120 correlations were determined. 


1. O-Objectivity in the Guilford-Zimmerman has a high positive correlation (r = .78) with O-Objectivity in the Guilford-Martin.

2. P-Personal Relations in the Guilford-Zimmerman has a high positive correlation (r = .82) with Co-Cooperativeness in the Guilford-Martin.

3. F-Friendliness in the Guilford-Zimmerman has a high positive correlation (r = .76) with Ag-Agreeableness in the Guilford-Martin.

4. E-Emotional Stability in the Guilford-Zimmerman has a high positive correlation (r = .80) with B1-N-Neurotic Tendency, reversed and oriented as Stability, in the Bernreuter.

5. A-Ascendance in the Guilford-Zimmerman has a high positive correlation (r = .80) with B4-D-Dominance in the Bernreuter.

6. There is no indication of a high degree of relationship between B2-S-Self-Sufficiency in the Bernreuter and any of the Guilford factors.

7. The high positive correlation between the Guilford-Zimmerman factors O-Objectivity and E-Emotional Stability of .84 suggests that these are not separate traits. There is further evidence for this in the high positive correlation between O-Objectivity in the Guilford-Martin and E-Emotional Stability in the Guilford-Zimmerman of .74, and between O-Objectivity in the Guilford-Martin and B1-N-Neurotic Tendency reversed and oriented as Stability in the Bernreuter of .74.


Received June 26, 1950. 
Early publication. 


References 


1. Allport, G. W. Neurotic personality and traits of self-expression. J. soc. Psychol., 1930, [...].

2. Bernreuter, R. G. Manual for the Personality Inventory. Stanford University, California: Stanford University Press, 1935.

3. Cattell, R. B. Confirmation and clarification of the primary personality factors. Psychometrika, 1947, 12, 197-220.

4. Garrett, H. E. Statistics in psychology and education. New York: Longmans, Green and Company, 1948.

5. Guilford, J. P. Manual for the Guilford-Martin Personnel Inventory. Beverly Hills, California: Sheridan Supply Company, 1943.

6. Guilford, J. P., and Zimmerman, W. S. Manual for the Guilford-Zimmerman Temperament Survey. Beverly Hills, California: Sheridan Supply Company, 1949.

7. Mosier, C. I. A factor analysis of certain neurotic tendencies. Psychometrika, 1937, 2, [...].

8. Super, D. E. The Bernreuter Personality Inventory: a review of research. Psychol. Bull., 1942, 39, 94-125.

9. Thurstone, L. L. The factorial description of temperament. Science, 1950, 111, 454-455.




The Non-Respondent Problem in Questionnaire Research 


W. Leslie Barnette, Jr. 
Department of Psychology, University of Buffalo 


A great deal of social science research is [...] conducted by means of mail questionnaires, despite wide recognition of pitfalls in this methodology. It is imperative, especially when questionnaire returns obtained from a sample are to be utilized for prediction for a population (the usual goal), that information be available about the non-respondents.

No attempt will here be made to review the literature which discusses this topic. The interested reader is referred to three recent [...] (5, 6, 12). Most workers attempt to minimize non-respondent bias by maximizing returns, usually through repeated [...]. Toops' early reports (9, 10) are well known in this connection. Here he was interested in securing a 100 per cent response, something which may rarely be attempted when the original sample is very large. Others have tried to cope with the problem by using special appeals and even bribes. Despite the fact that these techniques are effective, one is still left with a sizeable "hard core" non-respondent group. It becomes imperative to assess [...] source of bias here.

It is not always the case that the more verbal individuals respond. Clark's study (4), involving a mail questionnaire to A. F. L. union members, demonstrated this. The experience of the present writer has been similar (1), during a pre-testing of a questionnaire to be used with a large group of veterans. Here some [...] questionnaires were mailed out to a random sample; three weeks later a 50 per cent response was secured with no intervening follow-up. Just as many replies were received from those in professional and managerial categories as from those in skilled and semi-skilled groups.

In the opinion of the writer, the only real solution is by taking a random sample of the entire non-respondent group. By repeated and persistent follow-ups one aims to secure, by this means, a 100 per cent return. One then has relevant data for the detection of non-respondent bias. The following experience of the writer is offered as evidence.

A large-scale follow-up of some 1,300 veterans 
in the Greater New York area was in progress. 
Replies to the first questionnaire mailing 
totalled 580; a second mailing, some three 
months later, produced 310 additional replies. 
The combined first two waves thus gave 890 
(69 per cent) respondents. The average time 
lapse between the date last seen and the 
questionnaire receipt was two years. All 
addressees had received free educational and 
vocational counseling at the agency in question 
so that they were presumably in its debt. First
versus second waves of returns, when checked 
by chi-square, showed no important differ- 
ences. However, despite this comforting fact, 
there still remained a total of 409 (31 per cent) 
non-respondents. 
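
The return figures above can be re-derived in a few lines. The base of 1,299 is an assumption: it is the sum of the 890 respondents and 409 non-respondents reported here, slightly below the rounded "some 1,300" mailed. Variable names are hypothetical.

```python
first_wave = 580
second_wave = 310
non_respondents = 409

respondents = first_wave + second_wave
working_total = respondents + non_respondents  # assumed base for the percentages

pct_respondents = round(100 * respondents / working_total)
pct_non_respondents = round(100 * non_respondents / working_total)
print(respondents, working_total, pct_respondents, pct_non_respondents)
```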

It was not practical to attempt a third contact of the entire 409. A random sample of the group was drawn (every fifth name from an alphabetical file, which gave N = 82). It was not then a matter of merely sending out a third letter and waiting for the subsequent wave of returns. Persistent attempts were necessary with this group by means of intervening postcard reminders, further letters, telephone calls where feasible and, in some cases, personal calls at homes.

Parenthetically, the unit cost of such a procedure is prohibitive. For many of this group, a total of five separate personal letters were involved, each time inclosing the questionnaire and a stamped, self-addressed envelope; for others, suburban New York telephone calls, two and three per person. Such could rarely be done for as large a sample as the entire 409 recalcitrants. The net result was an expenditure of a great amount of time and effort and money for, in the main, a relatively small amount of data in return.

The spontaneous comments of some of these individuals showed that the repeated reminders by mail, the tone of which became more sharp each time, were effective. One respondent, apparently ashamed by the persistent reminders, finally returned the completed questionnaire, after receipt of the fifth letter, together with five three-cent stamps, as if in payment for the previously unsuccessful efforts at contact. Another respondent, after receipt of the sixth letter, returned the completed questionnaire with an apologetic note: "... thanks for having such patience with me and for making me ashamed of myself to a degree where I find myself resolving not to procrastinate in any form or manner whatsoever." Few, however, were as courteous or gave any signs of such remodelled behavior.
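
The sampling step described above, every fifth name from an alphabetical file of 409, can be sketched as what would now be called a systematic sample. The starting position is not given in the article; starting with the first name is an assumption, chosen because it yields the reported 82. The names are placeholders.

```python
# Placeholder alphabetical file of the 409 non-respondents (hypothetical names).
non_respondent_file = [f"veteran_{i:03d}" for i in range(1, 410)]

def every_kth(names, k, start=0):
    """Systematic sample: take every k-th entry beginning at index `start`."""
    return names[start::k]

# Assumption: the draw began with the first name in the file.
sample = every_kth(non_respondent_file, 5)
print(len(sample))
```

With 409 names, a draw that begins anywhere from the first to the fourth name gives 82; beginning at the fifth gives 81.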

In the end, returns were received from all but three persons in this random sample (i.e., 96 per cent replies). The results of this non-respondent check need not be reported here. Suffice it to say that, in the main, the trends observed in the data from the first two waves were corroborated. A few important differences were evidenced, however, which permitted a more reliable estimate for the entire veteran sample than would otherwise have been possible. These results are presented in two forthcoming papers by the present writer (2).

A subsidiary check for respondent bias (i.e., "early" versus "late" returns), which might be a cue for the presence of non-respondent bias, is occasionally available to the researcher. It is never one that might be employed on a large scale, however, without involving considerable annoyance to addressees. When one is engaged in sending out a second and third letter to non-respondents, it sometimes happens that one draws two returns from the same individual. One may then compare these two sets, a type of test-retest technique. The present writer met this situation for only five respondents; there was a minimum of three months' time lapse between the receipt of the first and second questionnaire. [...] number of items on [...] versus second [...] salary data, all of which involved a salary raise for the intervening time.

1 The respondent was in error by 50 per cent; he should have returned thirty cents.


Summary 


It is suggested that the real solution to the problem of bias detection, in the event that a numerically large non-respondent group remains after two or three mail follow-ups, is to make up a random sample of this hard core; then to make repeated and persistent attempts at contact. The aim is a 100 per cent response. An illustration of the application of this technique is given in connection with a large-scale follow-up of 1,300 veterans where, after two mailings, there still remained some 400 non-respondents. An incidental check, of the test-retest type, is pointed out.


Received February 28, 1950. 


References

1. Barnette, W. L., Jr. Preliminary report on a follow-up of veterans counseled at the Vocational Service Center, Y. M. C. A. of the City of New York. Counseling (Y. M. C. A., 347 Madison Avenue, New York City 17), 1948, [...].

2. Barnette, W. L., Jr. Report of a follow-up of counseled veterans. I. Public Law [...] versus Public Law 16 clients. II. Status [...] training. J. soc. Psychol. (forthcoming).

3. Bevis, J. C. Economical incentive used for mail questionnaire. Publ. Opin. Quart., [...], 492-493.

4. Clark, K. E. A vocational interest test at the skilled trades level. J. appl. Psychol., 1949, 33, 291-303.

5. Clausen, J. A., and Ford, R. N. Controlling bias in mail questionnaires. J. Amer. statist. Ass., 1947, 42, 497-511.

6. Norman, R. H. A review of some problems related to the mail questionnaire technique. Educ. Psychol. Measmt., 1948, 8, 235-248.

7. Rollins, M. The practical use of repeated questionnaire waves. J. appl. Psychol., [...], 770-772.

8. Sletto, R. Pretesting of questionnaires. Amer. sociol. Rev., 1940, 5, 193-200.

9. Toops, H. A. Validating the questionnaire method. J. Person. Res., 1923, 2, 152-169.

10. Toops, H. A. The returns from follow-up letters to questionnaires. J. appl. Psychol., [...], 92-101.

11. Toops, H. A. Questionnaires. Pp. 948-[...]. In Monroe, W. S. (ed.): Encyclopedia of Educational Research. New York: Macmillan, [...] (revised).

12. [...]


Reactions of Veterans to Counseling * 


W. Leslie Barnette, Jr. 
Department of Psychology, University of Buffalo 


* This article is based on a section of the author's
Ph.D. dissertation, Department of Psychology, New York
University, entitled "Occupational Aptitude Patterns of
Counseled Veterans." All statements relating to veterans
are the sole responsibility of the author. Colonel Walter
Ketcham, Chief of Advisement and Guidance of the New York
Regional Office of the Veterans Administration, approved
the entire research project.

As part of a large-scale follow-up of counseled
veterans by means of a mail questionnaire,
reactions of such clients to the counseling
process were solicited. These data provide
information regarding the value of counseling
as seen from the vantage point of the client and
have diverse implications for the psychological
counselor. At the bottom of a one-page
questionnaire, designed primarily to elicit facts
about the veteran's current status in training
some two years after counseling, two items were
placed which directly touched on client attitudes
to services received. A pilot study, in which the
questionnaire had been tried out, produced many
letters from veterans amplifying the information
requested on the questionnaire itself and dealing
specifically with counseling reactions. These
proved to be of such value that, when it came to
the main study, efforts were made to encourage as
many of these spontaneous comments as possible.
Two mailings of the questionnaire were involved.
The working total (after deduction of post office
rejects) was 1,299. From the two mailings a total
of 890 respondents was secured, containing
approximately twice as many non-disabled veterans
(those covered by the provisions of Public Law 346)
as disabled (Public Law 16). To check on the
reliability of the obtained data from these two
follow-ups, a random sample of the non-respondents
was selected for persistent and repeated follow-up.
A 100 per cent return from this non-respondent
sample was secured, the results of which altered
the main trends of the data in no important way.
It is therefore felt that the data from the study
of the 890 respondents from the first two waves
possess considerable reliability. The respondents
were all completed advisements from the files of
the Vocational Service
Center of the Y. M. C. A. of New York City 
(hereafter, VSC). The two questions dealing 
with attitudes towards counseling were adapted 
from the reports of the Adjustment Service 
(6, 10). The first of these asked: Did you 
consider the suggestions made to you at VSC 
to be (check one) helpful, of doubtful value,
impractical? The second: I found the attitude 
of the VSC staff to be (check one) sympathetic, 
indifferent, discourteous. The distribution of 
the replies from the 890 returns is given in 


Table 1. 
Table 1

Veterans' Opinions of VSC

Suggestions were:
    Helpful         668   (75%)
    Doubtful        169   (19%)
    Impractical      28   ( 3%)
    No reply         25   ( 3%)
    Totals          890    100

Staff attitude was:
    Sympathetic     809   (91%)
    Indifferent      56   ( 6%)
    Discourteous      0
    No reply         25   ( 3%)
    Totals          890    100
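
The percentages in Table 1 can be checked directly from the reported counts. The short sketch below is offered only as a convenience for the reader; the function and variable names are illustrative, and the counts are those printed in the table.

```python
def percentages(counts):
    """Convert category counts to whole-number percentages of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}

# Counts as printed in Table 1 (N = 890 returned questionnaires).
suggestions = {"Helpful": 668, "Doubtful": 169, "Impractical": 28, "No reply": 25}
staff_attitude = {"Sympathetic": 809, "Indifferent": 56, "Discourteous": 0, "No reply": 25}

for title, counts in (("Suggestions", suggestions), ("Staff attitude", staff_attitude)):
    print(title, "N =", sum(counts.values()))
    for label, pct in percentages(counts).items():
        print(f"  {label:<12} {counts[label]:>4} ({pct}%)")
```

Both blocks of the table sum to 890, and the rounded percentages agree with the printed figures.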


Despite the highly favorable opinion, replies 
falling into the other categories were welcomed 
as evidence that the returns were not producing 
a kind of “white-wash.” Many respondents 
appended further comments in support of this 
majority opinion, two samples of which are: 

Glad I visited your office. Put me on the goal to
my future. If not for your counselors I guess
I never would enjoy working today. Thanks
for your kind service.

Personally I found your counseling so
impressive that I resolved to "stick to it"
regardless of future distractions, inducements,
etc., believing that would be the best, most
rewarding, long-range plan.

A total of 19 per cent indicated the counselor
suggestions were of doubtful value. These


persons, in the main, either regarded the sug- 
gestions as unrealistic because the training 
suggested would be too extensive or else they 
indicated disappointment at being given no 
"new" information about themselves. 

A still smaller group (3 per cent) checked the 
"impractical" category. In support of these 
ratings one states: 


. . . Suggestions impractical because the 
suggested goal requires years of night study of 
an extremely complex subject (law) without 
any possibility of an adequate living for the 
years of study plus other years of attempting 
to establish a clientele; and I was and am un- 
willing to begin life at 40 . . . (continues about 
his present job). . . . Am doing work I enjoy,
am making a living and have time to worry 
about the world and its works. In other words 
you can put me down as "adjusted" even 
though your recommendation was not followed. 
Thanks for your help. 


Turning to a brief consideration of client
reactions to the item concerning the staff atti-
tude, the overwhelming majority is again
highly favorable. Many of the unsolicited 
comments were overly enthusiastic, containing 
direct compliments and testimonials to coun- 
selor effectiveness. The following quotations, 
from the many, will give the flavor. 


I thank you and the staff from the bottom of 
my heart. May there be more people like you. 
Keep up the good work. 

Was most favorably impressed by the helpful
attitude and results. At no time was the
impression given that here was a group of
magicians who could determine an exact endeavor
which could not fail—Yet the pattern made by
the apparently exhaustive tests was very clear.

Can't suggest anything more helpful than the
hope you'll continue—as is.


A very small percentage indicated they felt
some indifference. This is rarely explained
and, when attempted, is somewhat garbled.
No respondent checked the "discourteous"
entry—possibly too strong a word here.

More productive from the point of view of
studying counseling from the vantage point of
the client are the spontaneous remarks in
connection with these two items.1 This type of
supplementary data, of course, was not received
from all. Of the total of 890 returned
questionnaires, 575 individuals or 64 per cent
offered additional remarks. These comments
run a large gamut. They vary in length from
a succinct thanks for help to a 13-page letter,
from notes full of errors in grammar and spelling
to typed biographical essays, from a warm
and receptive attitude to the questionnaire and
the whole idea of a follow-up to irate
negativistic attitudes,2 from well-adjusted and
secure individuals to persons with signs of
aggravated emotional instability. All these
comments were not limited to counseling matters;
it will be only these types, however, that are
discussed in this paper. Most of the commenting
group supplied amplification concerning their job
or educational history; others gave extensive
occupational information; others raised questions
that demanded answers; still others simply talked
about irrelevancies. Any such classification of
these diverse comments must be a subjective one.
The categories which proved useful are reported
in Table 2. Any one respondent may have made more
than one type of comment; the percentages will not
therefore total 100.

1 The covering letter which accompanied the
questionnaire specifically requested any additional
information, such as feelings about VSC, that the
veteran would like to supply. The letter also
encouraged frankness and used the altruistic appeal
that such information would be of great value in
helping VSC deal with current veteran clients.

2 The most extreme example of this is from a Public
Law 16 client with a 50 per cent disability of
psychoneurosis, anxiety. A blank questionnaire was
returned with the following notation: "What are you
doing over there? Pimping for the . . . had anything
to do with you, all I got was a lot of red tape and
a 20 per cent cut in my pension. So from now on my
name is KIZ MIAZ."

Table 2

Content Analysis of All Spontaneous Remarks (N = 575)

    Info. augmenting Q items
      (i.e., no new facts)          392   (70%)
    Incr. self-confidence            80   (14%)
    Specific VSC compliments        100   (17%)
    Thanks for your interest        152   (26%)
    VSC criticisms                  102   (18%)
    VSC recommendations made         25   ( 4%)
    Preoccupied with tests           89   (15%)
    School difficulties              64   (11%)
    Job difficulties                 65   (11%)
    Employer gripe                   16   ( 3%)
    Feels goal unrealistic           29   ( 4%)
    Raises questions for answer      61   (11%)
    VA criticisms                    12   ( 2%)
    Miscellaneous                    69   (12%)
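
The remark that the Table 2 percentages need not total 100 follows from the coding scheme: each respondent may fall in several categories, and each percentage is taken against the number of respondents rather than the number of coded remarks. A minimal sketch with made-up data (the four coded respondents below are hypothetical, not drawn from the study) shows the effect.

```python
from collections import Counter

def category_percentages(coded_remarks):
    """Percent of respondents whose remarks fall in each category."""
    n_respondents = len(coded_remarks)
    # Count each category at most once per respondent.
    counts = Counter(cat for cats in coded_remarks for cat in set(cats))
    return {cat: 100 * k / n_respondents for cat, k in counts.items()}

# Hypothetical coding of four respondents' spontaneous remarks.
coded = [
    {"thanks", "compliment"},
    {"criticism"},
    {"thanks", "test preoccupied"},
    {"thanks"},
]
pct = category_percentages(coded)
print(pct)
print("sum of percentages:", sum(pct.values()))
```

Because one respondent can contribute to more than one row, the column of percentages sums to more than 100 whenever anyone makes more than one type of comment.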


Increased Self-Confidence 


Beyond a general statement from many
respondents indicating an appreciation for the
interest and help extended, many mention the
fact that tests and counseling served to increase
self-confidence. To vocational counselors, this
is no great surprise. Seipp's report (10) for the
Adjustment Service noted the same result;
Stone and Simos (11), in their Minnesota
follow-up, likewise. Despite the fact that no
specific question was asked regarding this point,
80 (14 per cent) of individuals making
spontaneous comments specifically mention this
point.

Tests encouraged me to continue with present
program; confidence increased. . . . Saved
me trouble of finding out the hard way.

. . . In helping me realize what action to take,
you and your staff opened a bright road
in front of me, for the road I had traversed
before was always dark and had no hope to see
a light at the end of it.


Critical Comments on Counseling 


Respondents were not uniformly so commendatory.
On the questionnaire some 22 per cent indicated
they regarded the suggestions given as of
doubtful value or impractical. Some of these
went on to a fuller explanation of such attitudes.
While many such comments evidence great
misapprehension as to the functions of a
counseling agency, at least most are honestly
offered and, in the main, well expressed. These
remarks frequently betray an exaggerated feeling
of dependence upon psychological help. These
respondents feel VSC is not definite enough in its
suggestions, while others think it is too definite;
others think too much emphasis is placed on
interests, claimed or measured. And others are
annoyed that the tests and counseling did not
provide them with startlingly novel information
about themselves. Some feel not enough pressure is
brought by the counselor—to "force decisions,"
as one respondent phrased the matter.

. . . Tests and interviews left me just about
where they found me. Confirmed aptitudes.
Showed up my inaptitudes. . . . actual
guidance. When I left the building I was as
befuddled as to the next step as when I


started. No schooling or type of job was rec- 
ommended. . . . Perhaps in helping a party 
such as me, it is harder to point out the road 
to follow. . . . In my case perhaps I had to 
find the machinery or equipment I wanted to 
sell. . . . Only an experienced person in touch 
with that type of business could have helped 
me... 

. . . Staff courteous but not convincing
enough. Should actually force the individual 
to try to decide whether he likes an occupation 
or not. . . . Why not arrange for actually 
visiting different occupations by potential em- 
ployees? . . . What good is testing a man 
whose test reveals he is best suited for a closed 
or already over-crowded field: spread out— 
don't limit yourself to one small part of the job. 


Recommendations Offered 


A few respondents—25 in all or some 4 per 
cent of the comment group—tried to make 
specific suggestions as to how the service might 
be improved. Some of these suggestions come 
from people who had distinctly favorable atti- 
tudes toward VSC but who felt the job might 
be done still better; other suggestions came 
from respondents with definitely unfavorable 
points of view. Actually there were only a 
small number of constructive suggestions. 

The two most frequent recommendations 
are that the counselors be more specific, do 
more “telling” and that a placement service be 
installed. The former is in disagreement with 
much of the best counseling theory; the latter is 
now an accomplished fact at VSC. Other 
suggestions are occasionally sound but there 
are others that are very unrealistic—those 
which advocate even more authoritarian ap-
proaches to clients. In the former category
are suggestions to the effect that more time
should be devoted to actual counseling—more
discussion of the individual, more "analysis"
done. These are the persons who, while they
regard the tests as valuable, want to go con-
siderably beyond them. Pressure of time and


the case load per counselor frequently operate 
against this. Certainly, however, wherever 
this aspect could be lengthened it would be 
desirable; it would also serve to reduce the 
ever-present potential of counseling failures. 


Test Preoccupied Clients 


There has been previous mention of signs of 


psychological dependency in some respondents, 




especially those with negatively critical com- 
ments about services rendered. There is 
another—and much larger—group with some 
of these same characteristics. These are the 
clients here designated as “test preoccupied.” 
In the spontaneous comments offered, some 
15 per cent of the group indicated, in one way 
or another, that the tests were the most valu- 
able part of the service. That is to say, such 
people mention only this phase of VSC's work, 
either favorably or not; they never talk about 
benefits that might have accrued from coun- 
seling. It would appear that the sole benefit— 
if, indeed, the respondents would term it that 
—received from VSC was the time spent in
the testing rooms. Not all of these are the 
"low achievers" as described in Friend and 
Haggard's monograph (5), but some of them 
are: people with poor emotional attitudes to- 
wards work, individuals with considerable 
rigidity as regards personality structure. 
Some, however, have already shown signs of 
good achievement. But the sheer frequency 
with which such comments crop up suggests 
client orientation toward vocational counseling 
that needs changing. 

As Covner (3) and others have pointed out, 
the usefulness of aptitude tests where it is 
demanded that counselors provide specific 
occupational information has long been ap- 
parent. An approach that is not exclusively 
oriented to aptitude testing may easily be 
applied both to the preparation of the client 
for tests and to interpretation of such results.
The comments of this group evidence the fact 
that many leave the entire counseling set-up 
with attitudes about tests that need re- 
directing. 

Others blame the tests for lack of specificity.
The tests only told them what they already
knew; there were no tests available for some
special occupational area, etc. A few
respondents specifically refer to the tests,
especially the interest inventories, as being too
susceptible to "faking." Several respondents,
as if not satisfied with the tests administered,
want more. Possibly they have changed their
occupational goal but still find themselves
undecided and uncertain. In any event, they
look to "the tests" for answers. The
psychological insecurity of some of these clients
is apparent from their comments. A more
exaggerated picture of this is to be seen in clients
who have "shopped around" for other tests.
The most extreme case is the one whose long
letter (13 pages) indicated he had been to a
minimum of seven different counseling agencies
and had been given tests at all.

It should be borne in mind by the reader
that such "test preoccupied" clients only
represent a small proportion of the total group
(15 per cent). A direct question regarding
test attitudes of respondents was not asked.
Had this been so, a larger number of such
replies might well have been received. Stone
and Simos (11), for example, did ask such a
question and found some 60 per cent with
"test oriented" attitudes. Covner (3) found
this a prevalent tendency in his clients; Kohn
(7) at the University of Washington reports that
more than 60 per cent of veterans registering
at the university's counseling center said they
desired vocational tests.

At any rate, the counseling process was
something less than a success with many of
this group who often need a discussion of their
unconscious emotional attitudes. They seem
to have only a mild interest in changing
present circumstances. Small wonder they
continue to look to "the tests" for the answers
and continue to insist on greater and greater
specificity both in test results and counselor
suggestions. Any sound counseling approach
which recognized motivation as the basic
datum would be of great help here, one
which would be organized around the study
and correlation of the dynamics of the
individual with those of the occupational
objective that is to be approved.3

3 See Part IV: "A Study of Job Values" in Friend
and Haggard (5) for a conceptual basis upon which
such a program might be built. McGregor (8) has
discussed the problems of industrial leadership and
supervision from the same point of view.

Questions Raised

The questionnaire revived many unanswered questions
for one group and, having had one helpful experience,
they turned again to the agency. Numerically
the group is not a large one, in the light of all
returns, but the extra time involved with these
cases was extensive. A total of 64 individuals
(11 per cent of the comment group) raised
questions demanding some sort of reply.
Many desired a written summary of test results,
which was duly sent out; many more
betrayed feelings of insecurity about present
status, some requesting another interview
while others left it up to VSC. The breakdown
is as follows: 20 (31 per cent) asked for
test scores; 31 (48 per cent) indicated further
counseling was needed; 8 (12 per cent)
placement help was requested; 5 (9 per cent)
miscellaneous.
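
The breakdown of the 64 question-raisers can be recomputed from the counts given in the text. The sketch below is illustrative only (the labels are paraphrases and the variable names are hypothetical); the computed shares may differ slightly from the rounded figures printed in the article.

```python
# Counts as given in the text for respondents who raised questions.
requests = {
    "asked for test scores": 20,
    "further counseling needed": 31,
    "placement help requested": 8,
    "miscellaneous": 5,
}
comment_group = 575  # respondents offering spontaneous remarks

total = sum(requests.values())
share_of_comment_group = 100 * total / comment_group
shares = {label: 100 * n / total for label, n in requests.items()}

print(f"{total} individuals = {share_of_comment_group:.1f}% of the comment group")
for label, n in requests.items():
    print(f"  {label:<28} {n:>3}  ({shares[label]:.1f}%)")
```

The four counts sum to the 64 individuals cited, and 64 of 575 is about 11 per cent, as stated.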

By and large these individuals are the "test
preoccupied" people. They are compulsive
about the tests they have already taken; they
want more in order to bolster feelings of self-
confidence. They ask unrealistic questions
about schooling or they want extra help in
other ways. Some did not even wait to write
a letter but came right in for a visit after
receipt of the questionnaire.


Implications for the Counselor 


What are the implications for the counselor?
The comments show that clients have done
considerable after-visit evaluation. From this
welter of information some specific suggestions
for counselors may be garnered. The
respondents have provided a picture of the
counseling process as seen from their vantage
point. It is evident that the typical client
arrives at this situation with two dominant
ideas; namely, that the counselor will tell him
what is best and that the tests will provide most
of the answers.

Few good counselors, of course, really play
an authoritarian role. It would certainly be
wise, with the more dependent clients at least,
specifically and explicitly to delineate the
counselor-client relationship as a cooperative
one, the essential purpose of all being
clarification and not to inform the client that
he is best equipped to pursue, say, bookkeeping
versus accounting. It might be well to state
at the outset that the arrival at any decisions
regarding future occupational goals is
unlikely.

The notion that vocational and educational
problems can be answered by tests must be
altered. Here might well be applied a tactic 
so that the counselor secures a picture of what 
the client feels on this score at the beginning.
There are, after all, vocational problems where
it would be unnecessary to administer any 
tests at all. Interview manuals often refer to 
the “preparation of the client for tests.” It is 
precisely at this stage where the counselor must 
so structure his remarks that false hopes and 
illusions about test results are not established 
in the client. Pursuing this tactic further, the 
client may frequently be allowed to participate 
in the actual business of deciding whether 
certain tests are suitable for his particular 
case; thus all responsibility is not placed on 
the counselor. Bordin and Bixler (2) provide 
excellent suggestions here; Seeman's study (9) 
on such a self-selection-test technique is also 
pertinent. 

The interpretation of test results to the 
client frequently demands all the subtlety, tact 
and skill that a counselor may muster. It is 
suggested here that it might be a useful device 
to secure first from the client a “free” response 
as to how he thought he performed. By such 
a means the counselor might quickly tell 
whether he need proceed cautiously and with 
understatement or otherwise. Often such a 
“free” response will disclose that the client has 
already done some accurate self-evaluation 
which might now be woven into the counseling 
fabric. Covner (3) has dealt with this aspect; 
Bixler and Bixler (1) offer detailed quotations 
and suggestions on this interpretation problem. 
At this stage, too, the counselor might attempt 
to forestall unfavorable reactions (frequently 
met) with clients actually launched on a job or 
school career or who are going through such 
services only because of necessary VA approval. 
The type of interpretation of test data for 
this sort of client should be quite different from 
that given to an individual who has come for 
counseling because of real need. 

Many clients are in need of a frank state- 
ment about the nature and purpose of interest 
inventories. The impression is to be avoided 
that interests, either claimed or measured, are 
of primary significance. This is frequently 
just the sort of notion many clients get, 
possibly because it is relatively easy for the 




counselor-client relationship to get under way 
by first discussing likes and dislikes. At all 
costs, the final impression, prevalent among the 
critical respondents, that all the counselor did 
was to agree with originally stated likes of the 
client, is to be avoided. Similarly, too com- 
plete reliance on high points of interest profiles 
may be of doubtful validity, as Diamond (4) 
has recently discussed. In addition, a succinct 
caution is sometimes necessary about the 
function of interest inventories and about the 
possible operation of unconscious sets on the 
part of clients that might distort scores. 

Counselors should take pains to explain 
clearly the type of norm data used on which 
raw scores are evaluated. Many veterans of
beyond high school level are deflated when 
they learn they receive above average scores 
on tests where twelfth grade norm groups are 
the standard. This is part and parcel of the 
need of developing local norms, 

Several respondents suggested that some
sort of a written summary for each client be 
provided. The idea, on the surface, appears 
to be a good one. This might easily be a one- 
page sheet where test scores are indicated in 
general terms along with a brief description of 
the test itself. It is suggested that any precise 
indication of test scores—such as percentile 
ratings, sigma scores, etc.—be avoided. To 
the psychometrically unsophisticated person 
such devices tend to indicate greater accuracy 
than they possess.4 

There must be an honest facing on the part
of the counselor towards all client handicaps—
such as a client's lack of a high school diploma.
No sugarcoating, as clients are quick to
perceive this type of "dishonesty"; they
specifically mention they resent being sent away
happy. Here also one might employ a non-
directive question or two, so as to get the
client to express in his own words what he feels
his present handicaps to be. This would, in
turn, help to minimize the "telling" on the part
of the counselor.

4 VSC practice conforms to this policy. Here
clients who desire further interpretation of test
results come in for an interview and the test
results are then presented verbally. This practice
obviously avoids the pitfalls inherent in a
procedure that allows the client to draw his own
conclusions.


As much advance planning should go into 
the summary interview as possible. Clients 
show by their comments that they arrive at 
this stage somewhat "keyed-up." Presentation
of facts about test performance is important
but is not the main function of counseling.
Here is where skillful weaving of test result
data into the counseling situation is paramount.
If the client has been properly prepared for
the tests in preliminary interviews, he will not
expect the counselor merely to sit back and
"tell." This interview might well be the longest
as regards actual time. Clients expect, often
unwisely, great things here; they are
disillusioned by speed or anything that might
be interpreted as vagueness.

A closing suggestion from the counselor that
he would be glad to talk again with the client
at some future date would be very acceptable
to many. To be sure, this is generally understood
—but often only by the agency personnel; it
is rarely stated explicitly. This then puts the
responsibility for a follow-up squarely on the
client. The spontaneous comments mention
that clients are very receptive to even a
questionnaire one or more years later; even
this check prompted several to call at the
office for interviews.

It must be recognized at the outset that not
all comers are fit for counseling, since there
may be dominant unconscious emotional
forces operating which vitiate all suggestions
no matter how carefully thought out and
handled by the counselor. A proper
beginning with clients might well identify
some of these "unfavorables" near the starting
line. The more that is known of these dynamics,
the better will all vocational and educational
counseling become.5

5 Friend and Haggard have not only stated them
explicitly but have made a significant beginning
towards this goal; see their monograph (5).

Summary

The reactions to counseling of 890 veterans
were studied by means of replies to a one-page
mail questionnaire together with spontaneous
comments. The counselor suggestions were
found to be genuinely helpful by 75 per cent
of the group; 91 per cent reported the staff
attitude a sympathetic one. Counseling served
to increase the self-confidence of many. The
prevalence of a "test preoccupied" group is
noted. Implications for the psychological
counselor are discussed in terms of the comments
offered by these veteran clients.


Received February 16, 1950. 


References 


1. Bixler, R. H., and Bixler, V. H. Test interpretation
   in vocational counseling. Educ. psychol. Measmt.,
   1946, 6, 145-156.
2. Bordin, E. S., and Bixler, R. H. Test selection:
   a process in counseling. Educ. psychol. Measmt.,
   1946, 6, 361-374.
3. Covner, B. J. Nondirective interviewing techniques
   in vocational counseling. J. consult. Psychol.,
   1947, 11, 70-73.
4. Diamond, S. The interpretation of interest profiles.
   J. appl. Psychol., 1948, 32, 512-520.
5. Friend, J. G., and Haggard, E. A. Work adjustment
   in relation to family background. Appl. Psychol.
   Monogr., 1948, No. 16. Pp. 150.
6. Hawkins, L. S., and Fialkin, H. N. Clients' opinions
   of the Adjustment Service. New York: Amer. Assoc.
   Adult Educ., Adjustment Service Report XII, 1935.
   Pp. 95.
7. Kohn, N., Jr. Trends and development of the
   vocational and other interests of veterans at
   Washington University. Educ. psychol. Measmt.,
   1947, 7, 631-637.
8. McGregor, D. Conditions of effective leadership
   in the industrial organization. J. consult. Psychol.,
   1944, 8, 55-63.
9. Seeman, J. A study of client self-selection of tests
   in vocational counseling. Educ. psychol. Measmt.,
   1948, 8, 327-346.
10. Seipp, E. A study of 100 clients of the Adjustment
    Service. New York: Amer. Assoc. Adult Educ.,
    Adjustment Service Report XI, 1935. Pp. 31.
11. Stone, C. H., and Simos, I. A follow-up study of
    personal counseling versus counseling by letter.
    J. appl. Psychol., 1948, 32, 408-414.


The Wechsler-Bellevue Intelligence Scale and High School Achievement * 


Arden N. Frandsen 
Utah State Agricultural College 


The validity of the Wechsler-Bellevue In- 
telligence Scale as a measure of intelligence as 
indicated by its correlations with other already 
established tests of intelligence has been re-
ported by several writers. Its correlations with
various other tests have ranged from .39 to 
.93 (25, p. 134). The median of six different 
correlations with the Revised Stanford-Binet 
is .87 (8, 16, 22, 25). And Altus and Mahler 
found a bi-serial r of .52 between W-B IQ's and 
graduation versus being discharged as inept 
from the 9th Service Command Special Train- 
ing Center (1). Tests, however, have multiple 
validities, which may vary with differences in 
the particular criteria one may wish to predict. 
High school counselors may be especially in- 
terested in the validity of the W-B, both as a 
whole and when abbreviated by division into 
selected subtest groups, for predicting achieve- 
ment in high school curricula. The data to be 
presented in this paper should contribute 
specifically to this question, and by implica- 
tion, also to an evaluation of the test for 
college guidance. 

Two previously reported investigations have 
contributed data on the validity of the W-B in 
predicting academic achievement at the college 
level, a situation closely related, of course, to
the question of the test's validity for pre- 
dicting high school academic achievement. 
Anderson (2), for 112 college women whose 
mean W-B IQ was 118.5, found correlations 
with average freshman grades of .45 for full 
scale IQ, .52 for verbal IQ, and of .23 for per- 
formance IQ. Sartain (22), for 50 college 
students whose mean IQ was 117.5, found the 
corresponding correlations of freshman grades 
with W-B IQ's to be .53, .58, and .35, re-
spectively. These coefficients compare very
favorably, especially for the W-B verbal scale,
with the modal range of correlations (.40 to .60)
which have been reported between high school
grades and group verbal intelligence tests
(6, p. 846).

* To Mr. Carl Hammer, principal, and to Mrs. Gladys
Issacs, Counselor, both of the West Lafayette, Indiana
High School, the writer is very grateful for their interest
and generous help in arranging for testing their students
and for supplying the achievement data.

Subjects. The subjects used in the studies 
conducted by Anderson and by Sartain were 
not selected to represent adequately the range 
of abilities of college freshmen, nor was the 
area in which achievement was appraised 
homogeneous. The subjects of the present 
study comprise practically an entire class (83
out of 90) of relatively superior high school
seniors from a mid-west university city whose
employed population consists predominantly of
university staff and otherwise of people employed
in professional and business occupations. A large
proportion (83%) of the high school seniors
anticipated attending college and were highly
motivated to meet academic requirements. There
were variations in the courses pursued, but nearly
all students included a basic core of
college-preparatory subjects in their courses of
study. And because these high school students were
living with their own families, within a fairly
narrow range of socio-economic status, their general
conditions of living were also probably more
similar than they were for the college students.
The ages (obtained when the W-B was given)
of these 83 high school seniors ranged s "1 
with a mode of 17. The range in 
was from 86 to 145, and the mean was in 
Hypotheses. In this study, the data on intelligence and academic achievement will be analyzed to test three hypotheses. 1. Under conditions more favorable for predicting achievement (a more complete range of abilities represented, high motivation, more homogeneous curricula, and probably more uniform general conditions of living), validity coefficients higher than those found by Anderson and Sartain for college students may be obtained. 2. Because certain subtests may dilute the predictive efficiency [...]


Wechsler- Bellevue Intelligence Scale and High School Achievement 


hypothesis 2, for testing situations where time
economy is important, valid abbreviated W-B
scales may be selected wisely according to 
their separate validities in predicting a particu- 
lar criterion. Even with drastic deletions from 
the full scale, the greater saturation of the 
abbreviation with valid items (for a particular 
criterion) may compensate for the reduced 
reliability of the shorter tests. 
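
The dilution argument in the second and third hypotheses can be illustrated with a small calculation. The sketch below is not part of the study: it assumes standardized subtests that all share one common intercorrelation, and every number in it is hypothetical. Under those assumptions the correlation of an unweighted sum of k subtests with a criterion is the sum of the separate validities divided by the standard deviation of the sum.

```python
import math


def composite_validity(validities, mean_intercorrelation):
    """Correlation of an unweighted sum of standardized subtests with a criterion.

    Assumes unit variance for every subtest and one common intercorrelation
    for every pair of subtests (a simplification made for illustration).
    """
    k = len(validities)
    # Variance of a sum of k unit-variance scores sharing one intercorrelation.
    variance_of_sum = k + k * (k - 1) * mean_intercorrelation
    return sum(validities) / math.sqrt(variance_of_sum)


# Hypothetical values, not taken from the study:
# seven subtests with validity .50 and three with validity .20.
full_scale = [0.50] * 7 + [0.20] * 3
shortened = [0.50] * 7

r_full = composite_validity(full_scale, 0.40)
r_short = composite_validity(shortened, 0.40)
print(round(r_full, 3), round(r_short, 3))
```

With these made-up figures the shortened composite correlates more highly with the criterion than the full composite, which is the pattern the hypotheses anticipate. The sketch ignores the loss of reliability that shortening a scale would also bring.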

The first hypothesis is similar to that proposed by Burtt and Arps to explain their higher correlation between Alpha intelligence and academic achievement in military academies compared to public high schools, r's being .39 and .19, respectively. In the academies, they observed, the “general supervision, supervised study, and a system of reward and punishment are more apt to hold the student to his maximum intellectual ability” (3, p. 293).

The latter two hypotheses are rendered plausible by the facts that the 11 different subtests of the W-B were selected to test a variety of mental functions and that the 45 intercorrelations reported by Wechsler (25, p. 223) for 10 of the subtests indicate only a moderate degree of communality among the mental functions being measured by the scale. They range, except for only 11 coefficients outside this range, from .3 to .5, with the mode being [...]. It is quite probable, therefore, that the separate tests are measuring to some extent [...] quasi-general aptitudes and/or specific abilities. And the comparability of the 11 subtest scales makes it feasible and [...] to take advantage of this flexible feature of the W-B test.
Procedure. The data of this study include for 83 high school seniors: (1) average grade-point-ratios earned in an academic curriculum and cumulated from grade 9 through [...]; (2) Henmon-Nelson, Form A IQ's; and (3) Wechsler-Bellevue IQ's and separate subtest weighted scores. The W-B tests were administered to the high school students, at different periods during their senior year, by the writer's Purdue University students in Individual Intelligence Testing, after they had demonstrated to the writer adequate skill in administration of the test. Product-moment correlations were determined between grade-point-ratios and the IQ's for both tests and with the W-B subtests, singly and in various combinations.
Results 


1. IQ's and Grade-Point-Ratios. The data 
for interpreting IQ's and for evaluating their 
validity in predicting academic achievement 
in this high school are presented in Table 1. 


Table 1


W-B IQ and Henmon-Nelson IQ Data for 83 High School Seniors and the Correlations of the Tests with Total (3 years) Grade-Point-Ratios


Tests                         Means    S.D.     r with G.P.R.
Wechsler-Bellevue, F-IQ       119.8    11.86    .685
Wechsler-Bellevue, V-IQ       115.7    12.96    .69
Wechsler-Bellevue, P-IQ       119.6    10.08    .48
Henmon-Nelson, Form A IQ      112.1    12.84    .52


Taking as a basis of comparison the Henmon- 
Nelson IQ and grade-point-ratio correlation of 
.52, which is as high as has usually been re- 
ported in the literature (6, p. 845) for similar 
prognostic relationships, the correlations be- 
tween W-B IQ's and grade-point-ratios are 
surprisingly high. As is shown in Table 1, 
the W-B verbal IQ alone and the full scale IQ 
are equally good predictors of high school 
academic achievement, the correlations being 
.69 and .685, respectively. For the perform- 
ance scale, the correlation of .48 with grade- 
point-ratio is much lower, but it is only slightly 
lower than the corresponding correlation of .52 
with the Henmon-Nelson group test IQ's. 
The relatively high validity of the W-B, espe- 
cially the verbal scale, in predicting high school 
achievement, although perhaps surprising in 
comparison with the results previously re- 
ported both for group tests and for the W-B on 
the college level, is in line with our hypothesis 
that when the factors of motivation, and uni- 
formity of curriculum and of living conditions 
are all favorable, higher validities are to be 


expected. 
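
As an aid to reading the coefficients in Table 1, the sampling uncertainty of a correlation based on 83 cases can be approximated with Fisher's z transformation. This is a minimal sketch and not an analysis reported in the study; the function name and the choice of a 95 per cent interval are assumptions made here for illustration.

```python
import math


def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z.

    r is the observed product-moment correlation, n the number of cases,
    and z_crit the normal deviate for the chosen level (1.96 for 95%).
    """
    z = math.atanh(r)                  # Fisher's z transformation
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Correlations with grade-point-ratio from Table 1 (N = 83).
for label, r in [("W-B F-IQ", 0.685), ("W-B V-IQ", 0.69),
                 ("W-B P-IQ", 0.48), ("Henmon-Nelson", 0.52)]:
    low, high = fisher_interval(r, 83)
    print(f"{label}: r = {r:.3f}, interval {low:.2f} to {high:.2f}")
```

The intervals are approximate and treat each coefficient separately; they do not by themselves test the difference between two correlations obtained on the same students.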


Referring again to Table 1, it should also be 
noted that the mean IQ of our group is rela- 
tively high, being 119.8 for the W-B full scale. 
Because of this fact and in the light of Gold- 
farb's contention that the W-B is relatively in- 
effective in discriminating among a group of 
superior adolescents (8), it might have been 
inferred that the test would have proved to be 
an inefficient guidance device for our popula- 
tion. The correlation results, however, are 
inconsistent with this assumption; and three 
other indices indicate adequate discrimination 
among our population of superior high school 
students: (1) the IQ's vary in range from 86 to 
145, (2) the standard deviation of 11.86 is 
only 2.64 points narrower than the standard 
deviation of 14.50 for Wechsler's standardization 
group of comparable age (25, p. 122), and (3) 
the distribution of IQ's appears approximately 
normal, being not markedly bunched toward 
the highest IQ’s. In interpreting IQ's for these 
superior students, however, local in addition 
to "national" norms are helpful. For example, 
an IQ of 120 which corresponds to a Wechsler 
percentile of 93 is the equivalent of a percentile 
score of only 52 according to the local norms. 
In relating IQ's to probable potential achieve-
ment within the high school, the local norms
are more consistent; but for vocational
guidance, where the scope of educational and
occupational competition is national, per-
centiles based on Wechsler's more representa-
tive standardization group are, of course, more 
appropriate. 
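
The contrast between local and national norms can be sketched with a normal-curve approximation. The local mean and standard deviation (119.8 and 11.86) and the standardization-group standard deviation (14.50) are taken from the text; the national mean of 100 is an assumption made here, and the percentiles quoted in the text (93 and 52) come from the empirical norm tables, so the approximation will not reproduce them exactly.

```python
from statistics import NormalDist


def normal_percentile(iq, mean, sd):
    """Percentile rank of an IQ under a normal curve with the given mean and S.D."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(iq)


# Local norms: W-B full scale mean and S.D. for the 83 seniors (Table 1).
local = normal_percentile(120, 119.8, 11.86)

# National norms: S.D. of 14.50 from the text; the mean of 100 is assumed.
national = normal_percentile(120, 100.0, 14.50)

print(f"local percentile: {local:.1f}, national percentile: {national:.1f}")
```

The same IQ of 120 falls near the middle of the local group but near the top of the assumed national distribution, which is the point made in the paragraph above.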
In passing, it may also be interesting to note the intercorrelations between the Henmon-Nelson and the Wechsler-Bellevue IQ's. That the W-B verbal scale and the H-N measure quite similar traits is indicated by the correlation between them of .80. Between the W-B full scale IQ's and the H-N IQ's the correlation is .70, indicating fair agreement. But the correlation between H-N IQ's and W-B performance IQ's of only .44 indicates that the latter scale is measuring traits somewhat independent from the verbal abilities measured by both the H-N and the W-B verbal tests.

2. Correlations between Each W-B Subtest and Grade-Point-Ratio. As a preliminary both to determining the combination of subtests having a maximum correlation with academic achievement and to selecting promising subtest combinations for abbreviated scales, the mean subtest weights and their correlations with average grade-point-ratios were determined for each of the 10 subtests administered to our population of 80 high school seniors.² These data are presented in Table 2.

In Table 2, attention is first directed to the variations in subtest means. Two notably low scores are Digit Span and Arithmetic, tests which have in common the fact that they both involve numbers and require for successful performance high degrees of attention (especially digits) and concentration. Findings such as those of Lewinski (17) together with the fairly frequent references in the test protocols to comments by the students on feelings of inadequacy on the “number tests” suggest that in several instances efficiency for these two tests may have been impaired for these highly motivated students by temporary conditioned emotional “panics.” On the Block Designs test, which involves mainly analytic and synthetic abilities in dealing with [...] elements in multiple form-color [...], these superior adolescents achieved their best performance.


Table 2


Means and Correlations Between Separate W-B Subtest Weights and Academic Achievement (3 Years Cumulative Grade-Point-Ratios) for 80 High School Seniors


                        Mean* Wt.    r
Verbal Tests
  Information             11.9      .56
  Comprehension           13.0      [...]
  Digit Span              10.3      .35
  Arithmetic              10.6      [...]
  Similarities            12.5      [...]
Performance Tests
  P. Arrangement          12.5      .15
  P. Completion           12.7      .26
  Block Design            14.2      [...]
  Object Assembly         13.3      [...]
  Digit Symbol            12.6      .36

* In computing means, lower and upper limits of scores of 10, say, were assumed at 9.5 and 10.5, respectively.

² The number of subjects is reduced from 83 to 80 because for three students complete separate subtest scores were unavailable.

More important as a guide in selecting valid subtest combinations, however, are the correlations of the separate subtests with the achievement criterion. On the whole, as might have been expected from coefficients reported in Table 1, the verbal tests are distinctly more valid for this purpose than are the performance tests, with the exception of the Block Designs test. It may be interesting to note that among the relatively high academic achievers, none earned a low Block Design score. However, several of the low achievers earned inconsistently high Block Design scores. It would seem that the kind of “reasoning” measured by this test may be necessary for academic achievement, but it alone does not guarantee it. For three subtests (Picture Arrangement, Picture Completion, and Object Assembly) the correlations with grade-point-ratios fail to meet the statistical criterion of significance at the 1 per cent level of confidence. The scatter diagram for PC, on which test the ceiling limits weighted scores to 15, indicates that for these subjects this subtest is not discriminative; scores regardless of the achievement tended to approach the ceiling. For both PA and OA, although the scores were well distributed over the full range, they apparently corresponded with achievement at no part of the range.
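
The 1 per cent criterion mentioned above can be made concrete by asking how large a correlation must be, with 80 subjects, before it is unlikely to have arisen by chance. The sketch below uses the Fisher z approximation rather than whatever table the author consulted, which is an assumption made here; the two correlations tested at the end are hypothetical.

```python
import math


def critical_r(n, z_crit=2.576):
    """Smallest |r| significant at the two-sided level implied by z_crit.

    Uses the Fisher z approximation; z_crit = 2.576 corresponds to the
    1 per cent level. Exact tabled values may differ slightly.
    """
    return math.tanh(z_crit / math.sqrt(n - 3))


def is_significant(r, n, z_crit=2.576):
    """True when the observed correlation reaches the critical value."""
    return abs(r) >= critical_r(n, z_crit)


print(round(critical_r(80), 3))
print(is_significant(0.20, 80))  # hypothetical low subtest validity
print(is_significant(0.40, 80))  # hypothetical higher subtest validity
```

Under this approximation a subtest correlation with 80 cases needs to be somewhat below .30 in absolute size to reach the 1 per cent level, so coefficients in the teens and twenties fall short of it.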
3. Most Valid W-B Subtest Combination. According to our second hypothesis, W-B subtests which are separately invalid for predicting the criterion may dilute the predictive efficiency of other more valid subtests in the full scale. Table 2 clearly indicates three such relatively invalid subtests, the three (PA, PC, and OA) which have already been mentioned as having statistically insignificant correlations with grade-point-ratios. Moreover, although all of the W-B subtests correlate significantly (.41 to [...]) with the full scale, it happens that, except for Digit Span, these three tests also have the lowest correlations (.41 to .61) with the scale as a whole (25, p. 224). Therefore, when these tests are deleted, the remaining tests should have a greater saturation both with traits [...] to predict our criterion (academic achievement) and with the traits which the scale as a whole is measuring. The combination 10/7 (I + C + D + A + S + B + Dsy), which includes all of the subtests except PA, PC, and OA, was found to correlate .765 with grade-point-ratios, and thus supports our hypothesis quite well. It is recognized, of course, that we have not demonstrated statistically that this combination is the combination of maximum validity. Table 3, however, shows that compared to either the full scale or to any abbreviated combination which was evaluated, this combination predicts most efficiently our criterion.
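
The notation 10/7 (I + C + D + A + S + B + Dsy) can be read as a pro-rating rule: the sum of the seven subtest weights is multiplied by 10/7 so that it is expressed on the scale of a ten-subtest total. The sketch below applies that rule to the mean subtest weights of Table 2. The function name is hypothetical, and converting a pro-rated total to an IQ still requires Wechsler's tables, which are not reproduced here.

```python
def prorated_total(weighted_scores, n_full=10):
    """Pro-rate a sum of subtest weights to the scale of an n_full-subtest total."""
    k = len(weighted_scores)
    return n_full / k * sum(weighted_scores)


# Mean subtest weights from Table 2 for the seven retained subtests.
mean_weights = {"I": 11.9, "C": 13.0, "D": 10.3, "A": 10.6,
                "S": 12.5, "B": 14.2, "Dsy": 12.6}

seven_test_total = prorated_total(list(mean_weights.values()))

# A two-test abbreviation such as 5 (C + A) uses the multiplier 10/2 = 5.
two_test_total = prorated_total([mean_weights["C"], mean_weights["A"]])

print(round(seven_test_total, 2), round(two_test_total, 2))
```

The same function covers the other multipliers that appear in Table 3: 5/2 for four subtests and 10/3 for three.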

4. Validity of Abbreviated Scales. To meet the need for an individual intelligence examination in situations where economy of time is important, several abbreviations of the W-B have been proposed and evaluated (4, 7, 9, 10, 11, 12, 13, 14, 15, 18, 19, 21, 23, 24). In these explorations to find satisfactory abbreviations, the usual requirements in addition to economy of time have been: (1) mean of abbreviation approximating closely the full scale mean; (2) abbreviation IQ's correlating well with full scale IQ's; and (3) combinations of subtests likely to yield data which may be significant in the clinical diagnosis of personality. Criterion 1 has usually been satisfactorily attained by simple pro-rating of test weights or as in two instances (4, 15) by special factor weightings. Correlations ranging from .80 to .97 indicate that criterion 2 has been acceptably met. Except in one investigation (11), however, wherein abbreviation “CVS” subtest pattern was found to distinguish statistically both aged and schizophrenic subjects from normal and feeble-minded subjects, the claims that the abbreviations would yield significant “clinical” data have been supported only on the basis of logic and clinical experience. In this study of W-B abbreviations we shall be concerned with another criterion, namely their validity in predicting academic achievement in high school. In selecting promising subtest combinations, we shall, therefore, besides considering the abbreviations already proposed³ and the “aptitude” groupings suggested by

³ Because we did not regularly administer the Vocabulary subtest, some abbreviations (10, 11, 12, 18, 19) which included this excellent test or an abbreviated substitute for it (14, 24) were omitted in our evaluations.




Arden N. Frandsen 


Table 3


Correlations with Academic Achievement of the W-B Full Scale IQ, the Most Valid W-B Combination, and 12 Selected W-B Abbreviations for 80 High School Seniors


                                                              IQ for Mean       r with
Combination of Subtests                                       Pro-Rated Wts.    G.P.R.

a. W-B full scale IQ                                          119.8             .685
b. 10/7 (I+C+D+A+S+B+Dsy), Highest validity                   118.0             .765
1. Geil's 5/2 (C+S+D+B)                                       121.7             [...]
2. Patterson's “Reasoning Group” 5/2 (A+S+B+C)                [...]             [...]
3. Rabin's 10/3 (C+A+S)                                       118               [...]
4. 10/3 (I+A+S), Highest 3 r's with G.P.R.                    114               [...]
5. Diamond's “Linguistic Factor” 10/3 (I+C+S)                 119               [...]
6. Diamond's “Clerical Factor” 10/3 (D+A+Dsy)                 118.5             [...]
7. Diamond's “Spatial Factor” 10/3 (PC+OA+B)                  [...]             [...]
8. 10/3 (C+A+B), Highest r's plus qualitative clues           121               [...]
9. Cummings, MacPhee, & Wright's 5 (C+A)                      116               [...]
10. Gurvitz's 5 (PA+D)                                        [...]             [...]
11. 5 ([...]+A), Highest 2 r's with G.P.R.                    [...]             .63
12. 5 (A+B), Reasoning: Numerical and spatial symbols         120               .60
the factor research of Diamond (5) and Patterson (20), be guided by Table 2 showing the correlations of separate subtests with our criterion. All of these combinations and the data for evaluating them are presented in Table 3.

The data in Table 3 support quite well our third hypothesis, that concerning the possibility of finding W-B abbreviations which are valid for a particular purpose. According to the criterion of validity in predicting academic achievement, the table shows that four abbreviations (numbers 1, 2, 3, and 4) equal the efficiency in this respect of the full scale, although they require only about half the time for administration. Five other three-test or even two-test abbreviations (numbers 5, 8, 9, 11 and 12), besides being even more economical of time, are reasonably efficient in predicting academic achievement, the correlations ranging from .60 to .65. All are superior in this respect to the Henmon-Nelson group test of intelligence. That the abbreviations are not all effective, however, is clearly indicated by the low correlations of .37 and .39 for Diamond's “Spatial Factor” and Gurvitz's “ten-minute” scale. Nevertheless, three two-test abbreviations (numbers 9, 11, and 12) predict academic achievement for these students fairly efficiently, r's ranging from .60 to .63.


Summary


Using as subjects practically an entire class of 83 relatively superior high school seniors, the validity of the Wechsler-Bellevue Intelligence Scale, both as a whole and in various subtest combinations, has been evaluated for effectiveness in predicting academic achievement.

1. Both W-B full scale IQ's and verbal scale IQ's are found to predict three year average grade-point-ratios very efficiently, r being .69 for both tests. The W-B performance scale, however, measures traits less significantly related to academic achievement.

2. When the subtests found separately least valid for predicting achievement are deleted from the full scale, the resulting combination, 10/7 (I + C + D + A + S + B + Dsy), is found to correlate even higher (.765) with grade-point-ratios. This is probably the combination of maximum validity in predicting high school academic achievement.

3. By choosing subtest combinations according to their correlations separately with the criterion, at least nine different W-B abbreviations of satisfactory validity are found, correlations with academic achievement ranging from .60 to .71.

Our data are limited to high school seniors, but in the light of the results obtained by Anderson (2) and by Sartain (22), it is probable that these conclusions would also apply with college freshmen, when conditions equally favorable to effective prediction are present.


Received January 12, 1950. 


References


1. Altus, W. D., and Mahler, C. A. The significance of verbal aptitudes in the type of occupation pursued by illiterates. J. appl. Psychol., 1946, 30, 155-160.

2. Anderson, E. E., and others (Wilson College Studies in Psychology). A comparison of the Wechsler-Bellevue, Revised Stanford-Binet, and Am. Council on Educ. Tests at the college level. J. Psychol., 1942, 14, 317-326.

3. Burtt, H. E., and Arps, G. Correlation of Army Alpha Intelligence Tests with academic grades in high schools and military academies. J. appl. Psychol., 1920, 4, 289-293.

4. Cummings, S. B., Jr., MacPhee, H. M., and Wright, H. F. A rapid method of estimating the IQ's of subnormal white adults. J. Psychol., 1946, 21, 81-89.

5. Diamond, S. The Wechsler-Bellevue Intelligence Scales and certain vocational aptitude tests. J. Psychol., 1947, 24, 279-282.

6. Eurich, A. C., and Cain, L. R. Prognosis. In Monroe, W. S. (Ed.), Encyclopedia of educational research. New York: Macmillan, 1940, pp. 838-850.

7. Geil, G. A. A clinically useful abbreviated Wechsler-Bellevue Scale. J. Psychol., 1945, 20, 101-[...].

8. Goldfarb, W. Adolescent performance in the Wechsler-Bellevue Intelligence Scales and the Revised Stanford-Binet Examination, Form L. J. educ. Psychol., 1944, 35, 503-507.

9. Gurvitz, M. S. An alternate short form of the Wechsler-Bellevue Test. Amer. J. Orthopsychiat., 1945, 15, 727-733.

10. Hunt, W. A., Klebanoff, S. G., Mensh, I. N., and Williams, M. The validity of some abbreviated individual intelligence scales. J. consult. Psychol., 1948, 12, 48-52.

11. Hunt, W. A., French, E. G., Klebanoff, S. G., Mensh, I. N., and Williams, M. The clinical possibilities of an abbreviated individual intelligence test. J. consult. Psychol., 1948, 12, 171-173.

12. Hunt, W. A., French, E. G., Klebanoff, S. G., Mensh, I. N., and Williams, M. Further standardization of the CVS Individual Intelligence Scale. J. consult. Psychol., 1948, 12, 355-359.

13. Hunt, W. A., and French, E. G. Some abbreviated individual intelligence scales containing non-verbal items. J. consult. Psychol., 1949, 13, 119-123.

14. Hunt, W. A., and French, E. G. A second fifteen-word vocabulary test for use with abbreviated intelligence scales. J. consult. Psychol., 1949, 13, 124-126.

15. Kriegman, G., and Hansen, F. W. VIBS: A short form of the Wechsler-Bellevue Intelligence Scale. J. clin. Psychol., 1947, 3, 209-216.

16. Kutash, S. B. A comparison of the Wechsler-Bellevue and the Revised Stanford-Binet Scales for adult defective delinquents. Psychiat. Quart., 1945, 19, 677-685.

17. Lewinski, R. J. The psychometric pattern: I. Anxiety neurosis. J. clin. Psychol., 1945, 1, 214-221.

18. Patterson, C. H. A comparison of various 'short forms' of the Wechsler-Bellevue Scale. J. consult. Psychol., 1946, 10, 260-267.

19. Patterson, C. H. A further study of two short forms of the Wechsler-Bellevue Scale. J. consult. Psychol., 1948, 12, 147-152.

20. Patterson, C. H. Using the Wechsler-Bellevue Intelligence Scales. In-Service Training Section, Personnel Div., VA Regional Office, Minneapolis, Minn., 1948.

21. Rabin, A. I. A short form of the Wechsler-Bellevue Test. J. appl. Psychol., 1943, 27, 320-324.

22. Sartain, A. Q. A comparison of the New Revised Stanford-Binet, the Bellevue Scale, and certain group tests of intelligence. J. soc. Psychol., 1946, 23, 237-239.

23. Springer, N. N. A short form of the Wechsler-Bellevue Intelligence Test as applied to the naval personnel. Am. J. Orthopsychiat., 1946, 16, 341-344.

24. Thorndike, R. L. Two screening tests of verbal intelligence. J. appl. Psychol., 1942, 26, 128-135.

25. Wechsler, D. The measurement of adult intelligence. Baltimore, Williams and Wilkins Co., 1944.

The Minnesota Clerical Test: Sex Differences and
Norms for College Groups


Olga E. de Cillis Engelhardt 
Department of Psychology, The University of Connecticut 


The Minnesota Clerical Test is extensively used in business and industry (5), in vocational advisement (3), as a laboratory exercise in the introductory course in psychology (6), and to predict achievement in selected college courses (4). The manual for this test is replete with 54 different sets of norms, none of which, however, are for college students.

In connection with a larger research project being conducted by the School of Business Administration at The University of Connecticut, the author had occasion¹ to administer the Minnesota Clerical Test to large groups of college students. The purpose of this brief paper is two-fold: (1) to present a needed set of norms for college students; (2) to confirm for college students the finding of a sex difference in performance reported by Loevinger (2), and Thatcher (2) for grade school children, by Schneidler (2) for high school children, and by Andrew and Paterson (1) and Schneidler and Paterson (7) for adults in the general population.

¹ Graciously provided by Dean Lawrence J. Ackerman.


Table 1


Norms for Eastern Arts and Sciences College Students


                 Women                  Men
            Test 1    Test 2      Test 1    Test 2
Centile     Numbers   Names       Numbers   Names
99          187       189         166       172
90          164       177         147       151
80          148       164         137       144
70          139       155         131       136
60          133       144         124       128
50          128       140         116       120
40          121       131         109       115
30          115       123         105       104
20          107       108          98        97
10          100        93          88        84
1            80        77          75        67
N           101       101         141       141


Subjects and Procedure


Table 2


Norms for Eastern Business Administration College Students *


                  Men
            Test 1    Test 2
Centile     Numbers   Names
99          185       [...]
90          153       [...]
80          136       [...]
70          130       [...]
60          124       [...]
50          118       123
40          115       [...]
30          110       [...]
20          105       [...]
10           95       [...]
1           [...]     [...]
N           270       270

* There is not a sufficient number of women enrolled in the School of Business Administration to warrant publication of norms.


The subjects of this investigation were 512 junior and senior undergraduate students at The University of Connecticut. Of these 141 male students and 101 female students were enrolled in the College of Arts and Sciences in advanced courses in English, government, psychology, and zoology. The Arts and Sciences men ranged in age from 20 to 28; the median age was 21. The women ranged in age from [...] to 26; the median age was 20. The 270 Business Administration men consisted of [...] enrolled in a course in industrial psychology. They were primarily industry and [...] majors. These students ranged in age from 20 to 32; the median age of this group was [...]. The Minnesota Clerical Test was administered


Table 3


A Comparison of the Medians for the College Groups and Those Reported in the Minnesota Clerical Test Manual *


                                         Women                  Men
                                    Test 1    Test 2      Test 1    Test 2
Groups                              Numbers   Names       Numbers   Names
Arts and Science Students           128       140         116       120
Business Administration Students    --        --          118       123
Eastern Clerical Applicants         128       132         114       121
Southern Clerical Applicants        103       102         --        --
Adults Gainfully Occupied           109       111          83        78
Employed Clerical Workers           [...]     132         135       126

* (2, p. 8)


Table 4


Means and Standard Deviations of the Minnesota Clerical Test Scores


                                           Mean Raw    Standard
Test                                N      Scores      Deviations
Numbers
  Arts and Sciences Women           101    128.63      25.02
  Arts and Sciences Men             141    117.32      21.95
  Business Administration Men       270    121.52      21.74
Names
  Arts and Sciences Women           101    136.70      29.35
  Arts and Sciences Men             141    119.66      25.45
  Business Administration Men       270    125.44      25.17
Table 5


A Comparison of the Means and Standard Deviations of the College Groups with the General Population and the General Clerical Population *


Test 1, Numbers

          Gen. Population      Gen. Clerical        A.S. College         B.A.
          Men       Women      Men       Women      Men       Women      Men
Mean      83.1      113.1      [...]     [...]      117.32    128.63     121.52
S.D.      [...]     [...]      [...]     [...]      21.95     25.02      21.74
N         [...]     [...]      [...]     [...]      141       101        270

Test 2, Names

          Gen. Population      Gen. Clerical        A.S. College         B.A.
          Men       Women      Men       Women      Men       Women      Men
Mean      [...]     [...]      [...]     [...]      119.66    136.70     125.44
S.D.      [...]     [...]      [...]     [...]      25.45     29.35      25.17
N         [...]     [...]      [...]     [...]      141       101        270

* [...] p. 394; [...]


during regular class periods during the academic year 1949-1950.


Results 


Norms. The normative data for the college
population are presented in Tables 1 and 2. 
A comparison of the medians obtained in the 
present study with those reported in the 
manual (2, p. 8) is made in Table 3. The 
college population medians for both tests are 
above those reported for adults gainfully em- 
ployed and southern clerical applicants, but 
are below the medians for employed clerical 
workers. The medians of this study are most 
comparable to those reported for eastern 
clerical applicants. 

Mean Raw Scores and Standard Deviations. 
The means and standard deviations for both 
tests for the college population are presented 
in Table 4. A comparison of the general 
population and general clerical population with 
the sample of this study is made in Table 5. 
College men and women obtain a higher aver- 
age score than adults in the general population 
but a lower score than the general clerical 
population. With the exception of the stand- 
ard deviation obtained in the present study, 
for women on test 2, all other measures of 
dispersion are smaller for the college group 


Table 6


Differences between College Men and Women on the Minnesota Clerical Test


                                   Mean       S.E.
Test              Means            Diff.      Diff.      t
Numbers
  A.S. Men        117.32
  A.S. Women      128.63           11.31      3.10       3.65**
  B.A. Men        121.52
  A.S. Women      128.63            7.11      2.82       2.52*
Names
  A.S. Men        119.66
  A.S. Women      136.70           17.04      3.62       4.71**
  B.A. Men        125.44
  A.S. Women      136.70           11.26      3.30       3.41**

* Significant between the one and two per cent levels of confidence.
** Significant beyond the one per cent level of confidence.



than for the comparison groups (Table 5). 
This is to be expected since the college popula- 
tion is more homogeneous. 

Differences between Men and Women. The 
differences and their significances for both tests 
are shown in Table 6. All but one t value is
significant at beyond the .1% level. The dif- 
ference between Business Administration men 
and Arts and Sciences women is significant 
between the 1% and 2% level of confidence. 
The sex differences obtained in this study, 
although highly significant, are not as large as 
those reported in the literature (1, 7). The 
significant differences between college men and 
women found in this study, however, substan- 
tiate the earlier findings. 
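
The ratios in Table 6 can be checked from the summary statistics alone. The sketch below assumes that the standard error of each difference was obtained from the two separate variances without pooling; that is an assumption about the computation, not a statement from the paper. The means, standard deviations and group sizes are those given for Arts and Sciences men and women on Test 1, Numbers.

```python
import math


def difference_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means, its standard error, and their ratio.

    The standard error uses the two variances separately (no pooling),
    which is an assumption about how the tabled values were computed.
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    difference = mean_b - mean_a
    return difference, standard_error, difference / standard_error


# Test 1, Numbers: Arts and Sciences men versus Arts and Sciences women.
diff, se, t = difference_ratio(117.32, 21.95, 141, 128.63, 25.02, 101)
print(round(diff, 2), round(se, 2), round(t, 2))
```

The same call with the other three pairs of groups gives the remaining rows of the table.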


Summary 


One hundred and forty-one male and 101 
female undergraduate students in the college 
of Arts and Sciences and 270 males in the 
School of Business Administration at The Uni- 
versity of Connecticut took the Minnesota 
Clerical Test. Sets of norms for these groups 
are presented. The sex difference as indicated 
by other studies on grade and high school
children and adults is extended to include 
college students. 


Received August 14, 1950. 
Early publication. 


References


1. Andrew, D. M., and Paterson, D. G. Measured characteristics of clerical workers. Bull. Emp. Stab. Res. Inst., Univ. of Minnesota, 1934, [...], No. 1.

2. Andrew, D. M., and Paterson, D. G. Minnesota Clerical Test: 1946 revised manual for the Minnesota Vocational Test for Clerical Workers. New York: The Psychological Corp., 1946.

3. Baker, G., and Peatman, J. G. Tests used in Veterans Administration Advisement Units. Amer. Psychologist, 1947, 2, 99-102.

4. Barrett, D. M. Prediction of achievement in typewriting and stenography in a liberal arts college. J. appl. Psychol., 1946, 30, 624-630.

5. Bennett, G. K. and Wesman, A. G. Industrial test norms for a southern plant population. J. appl. Psychol., 1947, 31, 241-247.

6. Orbison, W. D., Bousfield, W. A., and de Cillis, O. E. [...] demonstrations and exercises in [...]. Edwards Bro[...].

7. Schneidler, G. G., and Paterson, D. G. Sex differences in clerical [...]. J. educ. Psychol., 1942, 33, [...].


An IBM Card Profile for the Strong Vocational Interest Blank 


Wilbur L. Layton 


Student Counseling Bureau, University of Minnesota 


I In the winter of 1949 the Strong Vocational 
come Blank was administered to approx- 
of M 7,500 high school seniors in the state 
inser ii. This testing' was a part of the 
soradh ide high school testing program spon- 
Such 2 Association of Minnesota C legis. 
made Miet rs a use of the Strong Blank was 

sible by the cooperation of Mr. Elmer 

ankes of Engineers Northwest.! 


HANKES REPORT FORM FOR— 
STRONG VOCATIONAL INTEREST TEST- MEN 


cupational scales printed on the back. In 
addition, the standard scores were coded so 
that frequency distributions and hence, statis- 
tics, could be obtained for each occupational 
scale. 

The code system developed for each occupa- 
tional scale on the Hankes report forms for 
men and women is shown in Table 1. 

Tt will be seen that the code numbers 3 and 
4, 5 and 6 can be combined to form class in- 
tervals of ten standard score units each. This 
gives seven class intervals of ten standard 


Table 1 


Transmutation Table for Occupational Scales 


tional 


Sample card for reporting Strong Voca 
Interest results for men. 


I 
i h Order to eliminate the expense of manual 
fro “ation and filing of the profiles resulting 
“ards us Program, the author designed two 
e ne for men, one for women. s 
vards were designed in a way such tha 
uie ched in the front of the cards sm 
: the letter grades on the various 0 
note on the 


lj, Stron 
3phkes 1% E. K., Jr., and Hankes, E. J. A 
> 212 zest Ree "Machine. 9 appl. Psychol., 1947, 


IBM Code     Standard     Strong Letter
Number       Score        Rating
9            65-74        A
8            55-64        A
7            45-54        A
6            40-44        B+
5            35-39        B
4            30-34        B-
3            25-29        C+
2            15-24        C
1            5-14         C
0            -10 to +4    C

Table 2


Transmutation Table for O. L., I. M. and M. F.
Scales on Men's Form


Standard     IBM Code
Score        Number
70-79        9
65-69        8
60-64        7
55-59        6
50-54        5
45-49        4
40-44        3
30-39        2
20-29        1
10-19        0




Table 3


Transmutation Table for M. F. Scale on
Women's Form


Standard     IBM Code
Score        Number
20-29        0
30-34        1
35-39        2
40-44        3
45-49        4
50-54        5
55-59        6
60-64        7
65-69        8
70-79        9


score units each and one class interval of 15
standard score units. Since standard scores
below a minus five occur very rarely, only a
little error is introduced by assuming that one
has eight class intervals of ten units each so
that means, standard deviations and other
statistics can be computed from the distribution
obtained from the IBM cards.

The code for Occupational Level, Interest
Maturity, and Masculinity-Femininity scales
on the form for men is shown in Table 2.
Code numbers 3 and 4, 5 and 6, and 7 and 8
can be combined to give a total of seven class
intervals of ten standard scores each.

The code used for the Masculinity-Femininity
scale on the form for women is shown
in Table 3. Code numbers 1 and 2, 3 and 4,
5 and 6, and 7 and 8 can be combined to give
a total of six class intervals of ten standard
scores each.

This compromise between precise statistical
coding and that needed for counseling is
adequate for most purposes. In the near future
a revision of these forms will include several
new scales which Hankes is planning to
add to his present services. Shaded
“chance” areas on the various scales will also
be incorporated in the revision.
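
The coding of Table 1 and the grouped-data approximation described above can be sketched in a few lines of Python. This is an illustration only, not the author's procedure: the names `TABLE_1`, `INTERVAL_MIDPOINT`, `ibm_code` and `grouped_mean_sd` are hypothetical, standard scores are assumed to be whole numbers, and code 0 is treated as the ten-unit interval from -5 to +4, which is the simplification the text proposes.

```python
import math

# (IBM code, lowest standard score, highest standard score, letter rating),
# transcribed from Table 1 for the occupational scales.
TABLE_1 = [
    (9, 65, 74, "A"),
    (8, 55, 64, "A"),
    (7, 45, 54, "A"),
    (6, 40, 44, "B+"),
    (5, 35, 39, "B"),
    (4, 30, 34, "B-"),
    (3, 25, 29, "C+"),
    (2, 15, 24, "C"),
    (1, 5, 14, "C"),
    (0, -10, 4, "C"),
]

# Midpoints of the eight ten-unit class intervals. Codes 5 and 6 are merged
# (35-44), codes 3 and 4 are merged (25-34), and code 0 is ASSUMED to cover
# only -5 to +4, as suggested in the text.
INTERVAL_MIDPOINT = {
    9: 69.5, 8: 59.5, 7: 49.5,
    6: 39.5, 5: 39.5,
    4: 29.5, 3: 29.5,
    2: 19.5, 1: 9.5, 0: -0.5,
}


def ibm_code(standard_score):
    """Return the IBM code number for a whole-number standard score."""
    for code, low, high, _letter in TABLE_1:
        if low <= standard_score <= high:
            return code
    raise ValueError(f"standard score {standard_score} is outside Table 1")


def grouped_mean_sd(code_counts):
    """Mean and S.D. from a frequency distribution of IBM codes."""
    n = sum(code_counts.values())
    mean = sum(INTERVAL_MIDPOINT[c] * f for c, f in code_counts.items()) / n
    variance = sum(
        f * (INTERVAL_MIDPOINT[c] - mean) ** 2 for c, f in code_counts.items()
    ) / n
    return mean, math.sqrt(variance)
```

For example, `grouped_mean_sd({8: 1, 7: 1})` treats the two cards as scores of 59.5 and 49.5. Merging the paired codes loses the five-unit detail kept for counseling, which is the compromise the article describes.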


Received February 16, 1950. 




The Super-Roper Technique as a Measure of Interest in Nursing 


Leslie Navran 


Stanford University 


In 1939, a systematic program of research
was begun at Clark University under the direction
of Donald Super (1, 4, 5, 7, 8) which was
designed to explore the possibilities of an objective
technique of interest measurement
which involved a motion picture presentation
of interests. Super's rationale was that a real
interest in a vocational field would be manifested
in attention to the film, and the closeness
of attention and hence the amount of
interest could be measured by a post-film test
of recognition of what had been seen.

This paper is concerned primarily with the
initial study, reported by Super and Roper
(8). They showed a strip film (6) of
the activities involved in nursing, as well as
information about opportunities in the
field, to 35 nurses and 111 high school
girls, 33 of whom expressed a desire to go into
nursing. The subjects were not told in advance
of the post-film test. The Strong Vocational
Interest Blank and the Otis S-A Test
of Mental Ability (Higher Form, A) were also
administered. For both nurses and girls,
the correlations between the Super Test of
Interest and each of the other tests were zero.
The interpretation suggested was that the
Super Test was measuring something other
than intelligence or subjective interest patterns.
The authors argued that it was logical,
therefore, to assume that the Super was a
measure of interest in the occupation itself.
The odd-even reliability of the test was .83.
Further analysis revealed that the
Super test differentiated the nurses from the
high school girls, and within the latter group, those
who wanted to be nurses from those who did
not.

Older (4) has summarized the studies which
were subsequently carried out. It should be
noted here that Bernstein (1) reported a
correlation in the low .30's between the Super-type
test for the metal trades and the Otis
when subjects expected a post-film test, and
that Super and Haddad (7) found that Super-type
test results were not significantly correlated
with familiarity with or information about
the occupations covered by the film. Older's
(4) study has been the last reported to date.
His results, based on a film which covered not
one, but a range of occupations, were similar, in
many respects, to those reported by Super and
Roper.


The Problem


While Super’s goal was to achieve a test 
which would be occupationally comprehensive, 
an equally valuable goal would be the further 
development of such a technique for a given 
specific field. Roper’s (5) results raised the 
questions of: (1) the usefulness of such a tech- 
nique in the selection of students for nursing; 
and (2) the relation between a high interest 
“in the work itself" and achievement. She 
made an attempt to investigate these ques- 
tions, but the negative results were inconclu- 
sive, since the criteria were considered defective, 
and the sample included only successful train-
ees. This study was an attempt to make a
more thorough investigation of the above ques- 
tions. The focus of interest was not on the 
Super Test of Interest in Nursing per se, but 
in the technique and its potentialities. 


Procedure 


The subjects were 70 girls entering nurses’ 
training in September 1947. Forty-four were 
divided among three hospitals affiliated with 
San Jose State College, and the remaining 26 
were enrolled at a hospital affiliated with the 
Stanford University School of Nursing. 

A Personal Data Sheet and an Information 
Blank, both of which were devised by the 
writer for this study, were added to Roper's 
(5) test battery. The Information Blank con- 
sisted of 15 items which formed a continuum 
of nursing knowledge ranging from that which 
is commonly known to that which is more 
likely to be known only by nurses. The test 
was pre-tested on a nursing sample and on a 
sample of senior high school girls not interested 


in nursing as a career. 




In order to motivate the subjects, the bat-
tery was administered as a part of their orien-
tation program. The order of the tests was
as follows: Information Blank, film and Super 
Test, Personal Data Sheet, Otis, and Strong. 
Except for the new procedures, this was the 
order used by Roper. All the girls were free 
from other commitments for at least one hour 
before testing. Ventilation, visibility of the 
screen, and other conditions were as optimal 
as possible.

Two changes were made with respect to the 
film: (1) one of the film's 40 frames was ex- 
posed for 17 seconds instead of the usual 15, 
to insure its being completely read; and (2) 
the subjects were instructed to expect a post- 
film test which would be based on the film. 
This has been acknowledged by Super¹ as a
practical necessity. With these exceptions, 
the comparison made later with Roper's data 
is between data obtained under essentially 
the same conditions. 

The scoring of the Otis and Strong was in 
accordance with their respective manuals. 
The scoring key for the Super Test was sup- 
plied in Roper's (5) thesis, and verified by both 
Roper and Super. 

The following two criteria of achievement 
were selected: (1) grades in courses taken 
during the first year of training, and (2) ratings 
of performance of duties while on the ward. 
Test results were withheld from the hospital 
and school administrators until the data rel- 
evant to the criteria were obtained, in order 
to avoid contamination of their judgments.


Results 


In the first two years of training, ten girls
left the San Jose group of trainees; four to be
married and six because of scholastic difficulties.
One girl left the Stanford class to be
married.

Comparison of the two student groups' Personal
Data Sheets yielded these major differences:
(1) The Stanford girls were a more homogeneous
age group and were two years older
in median age. (2) Eighty-seven per cent
of the San Jose girls had one year or less of
college education, in contrast to the one Stanford
girl (four per cent) who had as few
college years. This was a significant difference
(C.R. = 7.0; p < .001). (3) Fifty-two
per cent of the 19 San Jose girls who had had
college education had major subjects related
to nursing. In contrast, 88 per cent of the
Stanford girls were nursing majors (C.R. =
2.6; p < .01).

¹ Personal communication.


The percentage of the girls in each group
having work experience and previous social
contact with medical persons was approximately
the same. For both groups, the decision
to enter nursing was based on a few
main reasons: a long-term ambition to be a
nurse, a desire to help and work with people, an
assortment of practical reasons, and a liking
for the work. The most frequently cited
reason was: discussion with, and urging by,
friends, family and counselors.
Only 12 of the 70 girls reported having had
psychometric guidance.

The Information Blank significantly differentiated
all the groups to which it was administered
(see Table 1).

Going from the non-nurses to the trainees
and nurses, there is a significant increase in
the mean scores, a consistent drop in the
variability, and a narrowing of the range.
These results are what one would expect if
the test were valid. The test also significantly
discriminated the more highly educated and
presumably better informed about nursing
Stanford group from the San Jose group (C.R.
= 3.75; p < .001). Although it may be
argued that the small number of items cannot

Table 1


Summary of Information Blank Scores


Groups: H. S. Seniors (N = 82); Nursing Trainees; Nurses

Critical Ratios: Nurses X Trainees; Trainees X H. S. Girls;
Nurses X H. S. Girls





Table 2


A Comparison of the Most Similar Components of
Roper's and Navran's Sample with Respect
to the Otis, Strong, and Super Tests


                 Roper:              Navran:
                 H. S. Girls         H. S. Graduates
                 Indicating Choice   Entering Nurses'
                 of Nursing          Training           Critical
Test             (N = 33)            (N = 44)           Ratio
Otis     Mean    108.0               110.8              .90
         S.D.    9.9                 7.95
Strong   Mean    57.1                40.1               5.57 (p < .001)
         S.D.    6.5                 18.7
Super    Mean    51.1                87.5               12.59 (p < .000)
         S.D.    12.0                13.2


cover all possible areas of nursing information,
the obtained results indicate that these 15
items are a valid sampling of nursing information
which successfully differentiates better
informed from less informed groups.

The mean Otis IQs of the Stanford and San Jose
groups differed significantly
(C.R. = 2.1; p < .04). Only the Strong
failed to discriminate the two groups. The
Stanford mean was 43.8 and the San Jose
mean 40.1.

Super Test. For the purpose of investigating
the test's reliability, all 70 subjects were
grouped together. The correlation of the odd
versus even items for the whole test was .54,
raised to .70 by the Spearman-Brown formula.
The first four parts of the test correlated highly
with the entire test. This supports Roper's
view that Part V is virtually useless. The
odd-even reliability coefficient of these four
parts was .48, raised to .65. The first three
parts, reported by Super and Roper (8) as the
most valuable for discriminating nurses from
non-nurses, had a reliability of .52, raised to
.68.
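
The step from each odd-even coefficient to its "raised" value follows the Spearman-Brown formula for a test of doubled length. A minimal sketch, in which the function name `spearman_brown` and its `factor` parameter are illustrative rather than taken from the article:

```python
def spearman_brown(r_half, factor=2.0):
    """Estimated reliability of a test lengthened `factor` times.

    With factor=2 this is the usual correction applied to the
    correlation between odd and even halves of a test.
    """
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


# Odd-even coefficients reported for the shortened versions of the test.
four_parts = spearman_brown(0.48)
three_parts = spearman_brown(0.52)
```

Rounded to two places, `four_parts` agrees with the .65 given for the first four parts; the same call reproduces the corrected value for any other half-test correlation.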


Comparison of Roper's Results with Those of
the Present Study. As the San Jose group was
closest to the 33 high school seniors in Roper's
sample with respect to educational level,
the scores of these two groups were compared.
This comparison is of interest, despite the San
Jose group's being more highly selected (see
Table 2).

The two groups were not significantly dif- 
ferent in intelligence, as measured by the 
Otis. However, the two groups differed signif- 
icantly on both the Strong and Super tests, 
Roper’s sample having a marked edge on the 
Strong, while the subjects used in this study 
were superior on the Super. 
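
A critical ratio of this kind is ordinarily the difference between two means divided by the standard error of that difference. The sketch below assumes that form (the article does not state its formula) and assumes the Table 2 standard deviations read 6.5 and 18.7 for the Strong and 12.0 and 13.2 for the Super, with N = 33 and N = 44. The function name `critical_ratio` is hypothetical.

```python
import math


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two independent means over its standard error.

    ASSUMPTION: unpooled standard error, sqrt(sd_1^2/n_1 + sd_2^2/n_2).
    """
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# Roper's group first for the Strong, the present group first for the Super,
# so that both ratios come out positive.
strong_cr = critical_ratio(57.1, 6.5, 33, 40.1, 18.7, 44)
super_cr = critical_ratio(87.5, 13.2, 44, 51.1, 12.0, 33)
```

Under these assumptions both ratios fall within a few hundredths of the tabled 5.57 and 12.59, which suggests, without proving, that a formula of this form was used.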

The Strong Test comparison is clouded by 
the fact that a new revision of the test was 
used in this study. Ten items of the older 
form were cut out in this revision. Accord- 
ingly, some of the difference may be an artifact 
attributable to the Roper group’s having a 
greater number of scoring opportunities. 
Probably of greater importance is the likeli- 
hood that the two groups, being seven years 
apart, actually differ with respect to the nurs- 
ing pattern of likes and dislikes. 

The Super Test difference may be accounted 
for by three factors: the extremely low reli- 
ability of the test, the fact that the subjects 
in this study expected the test, and the differ- 
ence in selection. 

Test Intercorrelations. A comparison of the 
interrelationships among the tests with the 
intercorrelations obtained by Roper (5) is 
presented in Table 3. 

The correlation between the Super and Otis 
tests is the only one noticeably changed; the r 
of .345 is significant. This increase is in accord 
with the results reported by Bernstein (1) and 


Older (4). 
Table 3


A Comparison of the Obtained Test Intercorrelations
with Those Reported by Roper


                                          Roper       Roper
                       Navran             Nurses      H. S. Girls
                       (N = 70)           (N = 35)    (N = 111)
Super X Otis           .345 (p < .004)    .006        .14
Super X Strong         .055               -.008       .001
Strong X Otis          .016               -.04        .00
Information X Otis     .395 (p < .004)    ...         ...
Information X Super    ...                ...         ...
Information X Strong   ...                ...         ...




The relation between information and intel-
ligence, as measured by these tests, was
positive and significant. The Super and In- 
formation Blank showed a positive but insigni- 
ficant relationship. Evidently, the two tests 
are measuring essentially different functions. 
These results agree with those of Haddad and 
Super (7). 

Comparison of Successful and Unsuccessful
San Jose Trainees. An analysis of the test
performance of the ten girls who left training,
as compared to the 34 who remained, revealed
no significant differences. However, the evidence
indicated that those who left to get married
were students as promising as those still
in training. The scholastic failures scored
lower than the marrying girls on each of the
four tests, but only the difference in intelligence
was significant (t = 2.32; 8 d.f.; p <
.05).

The Relation Between Test Scores and the
Criterion of Grades. For each student group
test scores were correlated with the average
number of grade points for courses during the
first school year (see Table 4). The 12 grades
from A to F were equated with numerical
values from zero to eleven, and the usual formula
for computing grade (honor) points was
used. The grade point average was obtained
by dividing the total number of grade points
by the total number of course units.


Table 4


The Correlation of Tests in the Battery with
the Criterion of Grades


                                    Stanford-
               San Jose             Lane               Critical
Test           (N = 34)             (N = 25)           Ratio
Otis           .33 (p < .06)        .65 (p < .001)     1.5
Information    .45 (p < .01)        .28
Super          .20                  .12
Strong         .03                  -.04


Neither interest measure correlated significantly
with the criterion. The Otis correlations
were low to moderately positive, and it
would be the most useful test in the battery
for prediction. Only in the San Jose group
was the correlation for the Information Blank




Table 5


The Correlation of the Parts of the Super Test
with the Criterion of Grades


                    San Jose           Stanford-Lane
                    (N = 34)           (N = 25)
Part I              .41 (p < .02)      .61 (p < .01)
Part II             -.20               ...
Part III            -.004              ...
Part IV             .35                .02
Parts I, II, III    -.002              ...
Total Test          .20                ...


significantly positive. This suggests the possibility
that with increased intelligence the
function of previous information is minimized
in mastering course work.

The relationship of the various parts of the
Super to the criterion is shown in Table 5.
Only Part I, having 21 multiple-choice
items, was significantly related to the criterion
in each group. The variability of the correlations
of the other parts, both in amount and
direction, was evidence of both their unreliability
and their lack of relationship to the
criterion.

The Relation Between Test Scores and the
Criterion of Ratings of Performance of Duties on
the Wards. Only the ratings for the Santa
Clara County Hospital and Stanford Hospitals
were suitable for analysis (see Table 6).
The correlations for the Santa Clara group
are all lower than the Stanford correlations,
but the rank order is the same in both groups.
The Super Test correlations are not only low,


Table 6


The Correlation of Tests in the Battery with the
Criterion of Ratings of Performance
of Ward Duties


            Santa Clara         Stanford-Lane
            County Hospital     Hospital
Test        (N = 11)            (N = 24)
Otis        ...                 ...
Strong      .10                 ...
Super       -.32                ...




Table 7


The Correlation of the Parts of the Super Test with
the Criterion of Ratings of Performance
of Ward Duties


                    Santa Clara         Stanford-Lane
                    County Hospital     Hospital
                    (N = 10)            (N = 24)
Part I              -.47                ...
Part II             .07                 -.06
Part III            -.76 (p < .01)      .06
Part IV             -.25                -.26
Parts I, II, III    -.38                .08
Total Test          -.32                -.25


but negative. Only the Otis is correlated
positively with the criterion in both groups.
However, none of the correlations are significant.

Table 7 summarizes the correlations between
the parts of the Super and the criterion.
The correlations for the Santa Clara group,
based on only 11 cases, are not considered as
reliable as the Stanford correlations; but
neither set of coefficients indicates that the
Super is related positively to the criterion.


Discussion 


Reliability and Validity. It has been repeatedly
demonstrated that the Super Test
discriminates groups; but in view of the
present results, it is pertinent to question what
function is being discriminated.

There are two objections to the hypotheses
advanced by Super and Roper (8), and also by
Older (4), that it is logical to assume the Super
is measuring interest in the occupation
itself because of its low correlations with the
Strong and the Otis. First, the obtained
reliability coefficient, corrected, of .70 for
the total Super Test, is noticeably lower than
the coefficient of .83 previously reported.
Taken together with the fact that any odd-even
reliability coefficient will be too high
because the assumption of uncorrelated errors
is not met (2), the test's reliability is of so
low an order as to cast serious doubt on its
validity. Moreover, such low reliability would
decrease all correlations of the Super
with other tests, making the true relationships
among them ambiguous. Since its relationship
to intelligence is already significant, and
that with information of near-significance, the
possibility must be entertained that increased
reliability for the Super-type test might show
these factors to play a more significant role
than has been thought.
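
The depressing effect of unreliability on an observed correlation can be made concrete with the classical correction for attenuation. The article does not carry out this computation; the sketch below is offered only as an illustration, the function name `correct_for_attenuation` is hypothetical, and treating the Otis as perfectly reliable is a simplifying assumption that makes the estimate conservative.

```python
import math


def correct_for_attenuation(r_observed, reliability_x, reliability_y=1.0):
    """Estimate the correlation between two measures free of error.

    r_observed    : correlation actually obtained between X and Y
    reliability_x : reliability coefficient of X (e.g. corrected odd-even)
    reliability_y : reliability coefficient of Y (1.0 = ASSUMED error-free)
    """
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Super x Otis correlation of .345, with the corrected Super reliability of .70.
estimated = correct_for_attenuation(0.345, 0.70)
```

With a reliability of .70 the estimate is roughly .41, and it rises further as the assumed reliability falls, which is the sense in which a more reliable Super-type test might show intelligence to play a larger role.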

Secondly, there is a serious danger in the 
reasoning of the authors cited above: if one is 
to ascertain what a test measures by eliminat-
ing what it does not measure, then all the
variables which might reasonably be important 
should be considered. Unless this is done, 
any conclusion must be somewhat precarious. 

Interpretation of Results. The lack of rela-
tionship of Super Test scores to each of the
criteria cannot be attributed to mechanical 
imperfections in the assigning of grades and 
ratings. Space does not permit the extended 
treatment given this issue elsewhere (3); 
suffice it to say that inspection of the data re- 
vealed the spread of grades in the courses 
covered to be quite representative. Dis-
crimination was also enhanced by the treat- 
ment of the data. With respect to the rat- 
ings, the present study is admittedly weak 
because of the small number of cases. The 
lack of a uniform rating system for all the 
San Jose hospitals, as well as an opportunity 
to instruct the raters, made the loss of so many 
San Jose subjects unavoidable. Although the 
small size and relative homogeneity of the 
Stanford group was regrettable, it is doubtful
if many nursing schools have given as much 
thought and consideration to the adequacy of 
their rating system. Accordingly, there seems 
to be no reason to question the quality of the 
ratings for this group.

The Super's low reliability, with its depressant
effect on the intercorrelations with the
criteria, offers the most parsimonious interpretation.
The amount of error in the Super is
so great as to make it uncertain just what the
test is measuring. Consequently, the obtained
results cannot be viewed as reflecting the
relationship between objective interest and
achievement. The true relation will be open
to investigation when two conditions are satisfied:
(1) a reliable measuring instrument is
made available; and (2) it is positively established
that the technique is measuring objective
interest.

In connection with the latter point, atten- 
tion should be given to the possibility that 
something else other than interest is being 
measured. As presently constituted, two- 
thirds of the items in the Super Test of Interest 
in Nursing are descriptions of pictures which 
may or may not have been seen in the film. 
All the items in Older's (4) film were of this 
type. Thus, the tests favor those subjects 
who best recognize the word descriptions cor- 
responding to the images they have of the 
pictures they have seen. Super et al. make 
the assumption that this ability stems primar- 
ily from interest. It might prove enlightening 
to test the relation between a laboratory test 
of visual imagery and performance on the 
Super-type test. The relation of such per- 
formance to intelligence might be more clearly 
and adequately defined if the Otis were re- 
placed by a test which measures specific ap- 
titudes or abilities. It might also be that, in 
this Study, the Super technique is not discrim- 
inating between more and less interested 
people, but, instead, between groups whose 
differential amounts of exposure to medical 
stimuli make their perception of nursing scenes 
proportionately more meaningful. This differ- 
ential might be because of interest, but it might 
also be a matter of circumstance.

Implications for Future Research with the 
Super Technique. Future research should be 
concerned first with positive efforts to estab- 
lish what the technique measures, The fac- 
tors cited above, as well as other possibilities, 
should be investigated. Coordinated with 
this should be attempts to revise the existent 
Super-type tests so as to increase their relia- 
bility. In making these revisions, the results 
of this study indicate that multiple-choice 
items, used in Part I of the Nursing Test,
should be given greater weight; i.e., increased
in number. Not only did Part I correlate
significantly with grades, but this is the most 
reliable type of item employed by Super. The 


establishment of interest as the variable being 




measured and the construction of a reliable 
instrument would then permit the investiga- 
tion of the relation of objective interest to 


achievement.


Summary 


1. The relation of scores on the Super Test 
of Interest in Nursing to the criteria of grades 
and ratings of performance of duties on the 
ward was not significantly above a chance 
level. 

2. In view of the low reliability of the Super
Test, the above results are ambiguous. The
variable being measured may or may not be
more related to the criteria. Such a determination
must await a more reliable instrument.

3. It will be worthwhile to establish quite
positively that interest is the variable being
measured. The uncertainties resulting from
the technical defects of this and previous
studies make desirable the further investigation
of such other factors as experience and specific
aptitudes.


Received January 6, 1950.


References

1. Bernstein, Benjamin. A test of interest in the metal
   trades. Unpublished Master's thesis, Clark University,
   1940.
2. McNemar, Q. Psychological statistics. New York:
   John Wiley and Sons, Inc., 1949.
3. Navran, L. The relation of interest to achievement in
   nursing. Unpublished Master's thesis, Stanford
   University, 1949.
4. Older, H. J. An objective test of vocational interests.
   J. appl. Psychol., 1944, 28, 99-108.
5. Roper, Sylvia A. A test of interest in nursing. Unpublished
   Master's thesis, Clark University,
   1940.
6. Strip film: Nursing as a career. Chicago: Society
   for Visual Education, Inc.
7. Super, D. E., and Haddad, W. E. The effect of
   familiarity with an occupational field on a recognition
   test of vocational interest. J. educ. Psychol.,
   1943, 34, 103-109.
8. Super, D. E., and Roper, Sylvia A. An objective
   technique for testing vocational interests. J.
   appl. Psychol., 1941, 25, 487-498.


Acuity Differences between the Two Eyes and Job Performance 


Newell C. Kephart and Joseph M. Mason 


Occupational Research Center, Purdue University


The importance of vision in industry is
widely recognized, and the relationships between
certain visual skills and numerous aspects
of job performance have been emphasized.
Investigations by Kephart (3, 4), Stump (6),
and Tiffin (7, 8, 9, 10) have convincingly demonstrated
that visual skills are related to job
performance, for they have found very marked
relationships between these skills and hourly
production, rate of labor turnover, accident-proneness,
supervisors' ratings, and other acceptable
industrial criteria. In general, these
and numerous other studies had been concerned
with individual jobs, and not with a
large number of industrial jobs to determine existing
basic relationships of general applicability.

To investigate these general relationships,
McCormick (5) studied a group of approximately
5,500 employees on 92 jobs in a number
of different industrial plants. From his studies
he found that the visual acuity requirements
for acceptable performance on these
jobs seemed to be general and relative, and
that a certain minimum of far and near acuity
in both eyes was especially pertinent to
satisfactory performance. Beyond such a
minimum, the percentage of high criterion employees
increased continuously, but at a more
gradual rate. The relationship of job success
to worse eye acuity was also studied, and he
reports that the increase in probability of job
success with increase in worse eye acuity is
relatively constant for the entire range, both
for near and far acuity.

McCormick, as reported, studied only the
relationships between job performance and
both eye acuity, near and far, and between job
performance and worse eye acuity, near and
far. Nowhere in the published literature on
vision studies are we able to find an analysis
of the visual acuity differences between the
eyes, and the relationship of these differences
to job performance. The present investigation
is directed toward that objective.


Statement of the Problem 


For a number of years, the Occupational 
Research Center (11) of the Division of Ed- 
ucation and Applied Psychology, Purdue Uni- 
versity, has been actively engaged in the devel- 
opment of vision-test profiles. In order to as- 
sign employees to specific jobs in which their 
chances of success are greatest, it is necessary 
to evaluate the different visual requirements 
of each individual job and to measure in some 
adequate way the visual skills of the applicant. 
Industrial and business organizations, sub- 
scribing to the vision service,¹ use the profiles
developed in the Occupational Research Cen- 
ter to identify individuals whose vision does not 
meet the general visual demands of their jobs 
and who could possibly benefit from profes- 
sional eye care. The profiles are also used as 
an aid in employee selection. 

One question, unanswered by any of the 
previous vision studies, has to do with the 
relationship, if any, between job performance
and acuity differences between the two eyes.
In practically all vision tests, both eye acuity
is measured. Then with first one eye occluded
and then the other, measures of left and right
eye acuities are determined. It has been ob-
served that in a large percentage of individuals, 
one eye will be considerably better than the 
other. Is the “one-eyed” individual handi- 
capped in an industrial job? Or does the 
worker with one very good eye and one very 
weak eye perform on a comparable level with 
the worker who has two very good eyes? 

Thus, the present study is an extension of
the research of McCormick (5), on the relationship
between visual skills and job performance.
The specific objectives of the investigation
are:

1. An analysis of the over-all relationship
between better eye acuity, near and far, and
job performance of employees on a number of
different industrial jobs.


¹ Industrial Vision Service of the Bausch and Lomb
Optical Company, Rochester, New York.



2. An analysis of the over-all relationship 
between acuity differences in the two eyes and 
job performance.

3. After sub-dividing the total group into a 
number of sub-groups, on the basis of better 
eye acuity, an analysis of the acuity differences 
between the two eyes, as related to job per- 
formance, for each of these better eye acuity 
levels. 

Procedures 


Basic Data Used. The basic data used for 
this investigation were Ortho-Rater² test re-
sults, and job performance measures available 
in punched-card form in the Occupational Re- 
search Center. The data covered approxi- 
mately 5,600 employees on 92 jobs in a number 
of different industrial establishments. With a 
few exceptions, they are basically the same data 
used by McCormick (5) for his investigation. 

Vision Tests. Of the twelve vision tests 
included in the Ortho-Rater battery, only four 
were given attention in this investigation; these 
were: Near test (given at the optical equivalent 
of 13 inches), acuity, right eye, and acuity, left eye;
and far test (given at the optical equivalent of 
26 feet), acuity, right eye, and acuity, left eye. 

The better eye scores, whether left or right, 
for each individual, were used directly. By 
comparing the near acuity test scores of the 
left and right eyes for each individual, the one 
with the higher score was considered to be the 
better. The far acuity test scores for the left 
and right eyes of each individual were similarly 
compared and the better eye determined, 
Where both left and right eye had the same 
acuity score, the treatment was the same; for 
such an individual, the better eye and worse 
eye scores would be identical. 

The difference between the better eye score 
and the worse eye Score, for the near acuity 
tests, was then determined—a simple process 
of subtracting the one score from the other, 
A similar computation was made for all far 
acuity better-eye and worse-eye scores, The 
four measures of visual acuity that were ac- 
tually used in the investigation were then: 
near acuity, better eye; near acuity, difference 
between better eye and worse eye; far acuity 
better eye; and far acuity, difference between 
better eye and worse eye. 
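
The derivation of the better eye score and the difference score can be sketched as follows. The function name `acuity_measures` is hypothetical, and the range check simply reflects the Ortho-Rater scale of 0 to 15; the same call would be made once with an employee's near acuity scores and once with the far acuity scores.

```python
def acuity_measures(left_score, right_score):
    """Return (better eye, worse eye, difference) for one acuity test.

    When both eyes have the same score the better and worse eye scores
    are identical and the difference is zero.
    """
    for score in (left_score, right_score):
        if not 0 <= score <= 15:
            raise ValueError("Ortho-Rater acuity scores run from 0 to 15")
    better = max(left_score, right_score)
    worse = min(left_score, right_score)
    return better, worse, better - worse


# One hypothetical employee: left eye 12, right eye 9 on the near test.
near_better, near_worse, near_difference = acuity_measures(12, 9)
```

Only the better eye score and the difference enter the analysis, giving four measures per employee: two for near acuity and two for far acuity.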

In the Ortho-Rater acuity tests, the range of
possible scores is from 0 to 15. These scores
can be readily converted to other measures of
visual acuity, such as the Snellen notation or
the AMA acuity ratings, as explained in the
manual of standard practice for the Ortho-Rater
(12). With the large number of test
results available for this study, the distribution
of better eye scores covered the entire range,
although at both the lower and upper levels
the number of results included were relatively
small. The greatest difference between better
eye and worse eye acuities for any individual
was thirteen score points.

² The Ortho-Rater is a stereoscopic-type visual testing
instrument manufactured by the Bausch and Lomb
Optical Company, Rochester, N. Y.

Criteria of Job Performance. From one industrial
plant to another, and from one job to
another, the criteria of job performance varied
considerably. In some establishments the
criteria were based on such factors as hourly
production and earnings; in other plants the
measures of job performance were based on
supervisors' ratings. In general, however,
regardless of the basis of the criteria, the measurements
were reduced to numerical categories,
and these were subdivided into two groups,
identified as the “high criterion group” employees
and the “low criterion group” employees.
Occasionally where there was an odd number
of numerical categories, it was obviously impossible
to subdivide the middle category into
the “high and low criterion” groups, and it
became necessary to omit these employees from
consideration in the analysis. These omissions
account for the apparent discrepancy in numbers
of test results analyzed; from the initial
group of approximately 5,600 employees about
5,100 could be definitely assigned to one of the
two criterion groups.

Data on the Relationship between Better Eye
Acuity and Job Performance. In analyzing the
over-all relationship between better eye acuity
and job performance, summaries of the test
scores and the measures of job performance
were derived from the basic data. For near
acuity, better eye, the per cent of all employees
at a given score level who were high criterion
group employees was determined. These percentages
were computed for each score level of
the better eye, from 0 to 15. Similar per cents
were determined for each score level for far
acuity, better eye. Regarding this sample as
only one sample out of a larger population
having practically the same properties as the
ones obtained, it seemed desirable to obtain
“smoothed” percentages, thus minimizing irregularities
in the data. This was done by a
system of weighted averaging in which the
percentages at the score levels adjacent to a
given score are averaged with that of the given
score, the given score being allowed to carry
double-weight. McCormick (5), in smoothing
his data, assigned equal weight to all three
scores, and on this basis determined the average.
Both of these methods of “smoothing” were
applied to the data, and gave essentially the
same results,

50 it was decided to use the ae ra 
he “smoothed” per cent determ 


Acuity Differences between the Two Eyes and Job Performance 


given score was the quotient of the sum of the 
frequencies of all (both high and low criterion 
groups) employees for the corresponding three 
score levels. "Smoothed" percentages were
thus determined for better eye score levels, and
the resulting per cents subsequently plotted.
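A three-level smoothing of this kind can be sketched in Python as follows. The sketch rests on assumptions: the source text is incomplete at this point, so the numerator is taken to be the pooled frequency of high criterion employees over the same three score levels, and the end levels (which have only one neighbor) are smoothed over the two levels available. The function name is hypothetical.

```python
def smoothed_percent(high, total):
    """Smooth the per cent of high criterion employees over adjacent score levels.

    high[i]  -- number of high criterion employees at score level i
    total[i] -- number of all employees (high and low) at score level i

    For each level the frequencies at that level and at the levels
    immediately below and above are pooled, and the pooled high criterion
    frequency is divided by the pooled total frequency.
    """
    n = len(total)
    result = []
    for i in range(n):
        lo = max(0, i - 1)       # first level of the window (assumed edge rule)
        hi = min(n, i + 2)       # one past the last level of the window
        pooled_high = sum(high[lo:hi])
        pooled_total = sum(total[lo:hi])
        # None marks a window with no employees at all
        result.append(100.0 * pooled_high / pooled_total if pooled_total else None)
    return result
```

Pooling frequencies, rather than averaging three percentages, weights each score level by the number of employees it contains, so a sparsely populated level has little influence on its neighbors.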
Data on the Over-All Relationship between
Acuity Differences in the Two Eyes and Job
Performance. To analyze the over-all relationship
between acuity differences between the
two eyes and job performance, it was necessary
to derive the score difference for each individual
employee from the basic data of acuity
scores for the two eyes. The electric punched-card
tabulation equipment available in the
Occupational Research Center greatly facilitated
this phase of the investigation. The
approximately 5,600 cards were sorted as to
better eye acuity, and then in a second sorting,
according to worse eye. Then the difference
between the score of the better eye and the
score of the worse eye was punched on the card.

To study the over-all relationship of acuity
differences between the two eyes, employees
were grouped on the basis of this difference.
Thus all those showing a difference of three
score points between the two eyes were placed
in one group; such a category would then
include an individual scoring 3 on the better
eye and 0 on the worse eye, as well as an employee
scoring 15 on the better eye and 12 on
the worse eye, and all others where the score
difference between the two eyes was three
points. In similar manner, all employee test
results were categorized according to the acuity
differences between the two eyes. The per cent
of all employees at each difference level who
were high criterion group employees was determined.
These percentages were computed for
each acuity difference level, both for the far
acuity tests and the near acuity tests. To
minimize the irregularities in the data,
"smoothed" percentages were computed
for each difference level in the manner
described above. Subsequently, these
"smoothed" percentages were plotted to furnish
a visual picture of the over-all relationship
between acuity differences in the two eyes and
job performance.

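The grouping by acuity difference that was carried out with punched cards can be sketched in Python. The record layout (better eye score, worse eye score, and a flag for membership in the high criterion group) and the function name are hypothetical and serve only to illustrate the tabulation.

```python
from collections import defaultdict


def percent_high_by_difference(records):
    """Per cent of high criterion employees at each acuity difference level.

    records -- iterable of (better_score, worse_score, is_high) tuples,
               one per employee assigned to a criterion group.
    Returns a dict mapping each difference level to a percentage.
    """
    counts = defaultdict(lambda: [0, 0])  # difference -> [high count, total count]
    for better, worse, is_high in records:
        difference = better - worse
        counts[difference][1] += 1
        if is_high:
            counts[difference][0] += 1
    return {d: 100.0 * h / t for d, (h, t) in sorted(counts.items())}
```

An employee scoring 3 and 0 and another scoring 15 and 12 both fall in the difference level of three points, regardless of the level of the better eye.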
Data on the Relationship between Acuity Differences
in the Two Eyes and Job Performance for
Each Better Eye Acuity Level. The test scores
and measures of job performance derived from
the basic data were divided into 15 sub-groups
on the basis of better eye acuity score;
employees represented in any given sub-group
then had the same acuity score for the better
eye, whereas the acuity scores for the worse
eye could vary anywhere from the given acuity
of the better eye down to, and including, zero.
This finer breakdown into better eye acuity
levels was necessary in order to
study the relationship between acuity differences
in the two eyes and job performance at
each of these levels. For each sub-group, with 
a constant better eye acuity score, the per cent 
of the total at each difference level who were 
high criterion group employees was determined. 
Differences between the two eyes at each better 
eye acuity level were treated for the far tests 
as well as for the near tests, and the percent- 
ages of high criterion employees for each group 
were determined. "Smoothed" percentages 
were then calculated for all of these sub-groups 
in the manner described previously. To pre- 
sent a visual picture of the relationship between 
acuity differences in the two eyes and job per- 
formance, these “smoothed” percentages for 
each better eye acuity level were subsequently 
plotted. 

Statistical Test of Homogeneity of Sample
Applied to Each Sub-Group. Fisher (1) demonstrates
a procedure for testing the homogeneity
of a sample in a 2 × n′ classification
which is applicable to the data used in this
investigation. This test of homogeneity was
applied to all sub-groups, both for the near and
far tests, where the total N was sufficiently
large to make the test applicable. This resulted
in an elimination of only those sub-groups
where the better eye acuity score was
3 or less. For all other sub-groups where the
total for any given difference level was less than
15, the values were combined with the succeeding
difference groups to bring the total to the
required 15. In determining the proportions,
calculations were carried to seven decimal
places, as Fisher states (1: 91) that using five
decimal places the value of χ² is not quite
correct in the second decimal, and to avoid
doubts as to the precision of calculation, two
more places are desirable.

All χ² values determined after applying this
test of homogeneity were interpreted in terms
of P from the values given in Fisher's Tables
(1: 104-105), and subsequently each χ² value
was compared to the .05 probability values as
given in the Tables. This comparison was
made because Fisher states (1: 82): "We shall
not often be astray if we draw a conventional
line at .05, and consider that higher values of
χ² indicate a real discrepancy."
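The procedure described above can be sketched in Python. The sketch uses the ordinary expected-frequency form of χ² for a 2 × n′ table, which is algebraically equivalent to a test of homogeneity but is not necessarily the computing formula followed in the study. The function names are hypothetical, and the treatment of a final level that cannot reach 15 (it is folded into the preceding group) is an assumption, since the text does not say how such a remainder was handled. The table gives the familiar .05 points of χ².

```python
# Values of chi-square at P = .05 for 1 to 12 degrees of freedom.
CHI2_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592,
           7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307, 11: 19.675, 12: 21.026}


def pool_small_levels(high, total, minimum=15):
    """Combine difference levels with succeeding levels until each total reaches `minimum`."""
    pooled_high, pooled_total = [], []
    h_acc = t_acc = 0
    for h, t in zip(high, total):
        h_acc += h
        t_acc += t
        if t_acc >= minimum:
            pooled_high.append(h_acc)
            pooled_total.append(t_acc)
            h_acc = t_acc = 0
    if t_acc:  # leftover tail below the minimum (assumed: fold into the last group)
        if pooled_total:
            pooled_high[-1] += h_acc
            pooled_total[-1] += t_acc
        else:
            pooled_high.append(h_acc)
            pooled_total.append(t_acc)
    return pooled_high, pooled_total


def chi_square_2xn(high, total):
    """Chi-square and degrees of freedom for high/low counts across n levels.

    Assumes the sub-group contains both high and low criterion employees.
    """
    p = sum(high) / sum(total)  # proportion of high criterion employees in the whole sub-group
    chi2 = 0.0
    for h, t in zip(high, total):
        expected_high = t * p
        expected_low = t * (1.0 - p)
        chi2 += (h - expected_high) ** 2 / expected_high
        chi2 += ((t - h) - expected_low) ** 2 / expected_low
    return chi2, len(total) - 1


def exceeds_05(chi2, df):
    """True when chi-square is higher than the conventional .05 value."""
    return chi2 > CHI2_05[df]
```

In use, the counts for one better eye acuity level would first be passed through `pool_small_levels`, and the pooled counts then given to `chi_square_2xn`; a value not exceeding the .05 point gives no indication that the proportion of high criterion employees varies with the acuity difference.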


Results 


[Fig. 1. Relationship between better eye acuity and job performance. Horizontal axis: acuity score for the better eye, 0 to 15; curves for near acuity and far acuity.]

Relationship between Better Eye Acuity and Job
Performance. The relationship existing between
better eye acuity and job performance
is reflected by the percentage of individuals at
each score level who were in the high criterion
group of employees. The basic relationship
represented by these percentages is shown
graphically in Figure 1, which presents the
relationship for all of the 92 jobs combined,
both at the near and far acuity distances.
Both the far and near acuity, better eye, 
trend lines show a relatively marked increase 
in the percentage of high criterion group em- 
ployees at each successive better eye score 
value. Examination of Figure 1 shows that 
for all 92 jobs combined there is a general in- 
crease in the proportion of individuals who 
were high criterion group employees at suc- 
cessively higher scores, and that the increase 
is quite persistent throughout the entire range 
of scores, though perhaps a little more marked 
in the lower score range. The few irregulari- 
ties in the general trend line are probably due 
to sampling errors, for even though the total 
N is large, the number of employees scoring in 
the 0 to 3 and 13 to 15 score levels are relatively 
small. The general trend presented here 
graphically would indicate that at least a
moderate degree of visual acuity in one of the 
two eyes, both at near and far distance, is 
necessary for satisfactory performance on the 
types of industrial jobs covered by this in- 
vestigation. Further, it is indicated that the 
greater the degree of visual acuity in the better 
eye the greater the probability of success for
the employee on these jobs.

The general relationship existing between the
better eye acuity and job performance, as indicated
in Figure 1, is similar to the relationship
determined between both eye acuity and
job performance (5). This is true both for the
near and far acuity distances.

Over-All Relationship between Acuity Differences
in the Two Eyes and Job Performance. The
relationship existing between the visual acuity 
differences in the two eyes and job performance
is reflected by the percentage of employees at
each difference level who were in the high
criterion group of workers. The basic relationship
represented by these percentages is
shown graphically in Figure 2. In this figure
is presented the relationship for all of the 92
jobs combined, both at the near and far acuity
distances.

The acuity differences represented cover the
entire range of acuity differences between the
better eye and the worse eye. Thus, at any
given score difference on the base line, the
value plotted directly above it represents the
percentage of the total group having that
acuity difference who were in the high criterion
group of employees. For example, above the
point labeled "7" on the base line, are represented
all employees showing a difference of 7
score points between the better eye acuity
score and the worse eye acuity score; this would
include those who scored 7 on the better eye
and 0 on the worse eye, as well as those who
scored 14 on the better eye and 7 on the worse
eye, and all others whose acuity difference between
the better and worse eyes was 7 score
points.

For acuity differences between the two eyes,
there appears to be no relationship between
these differences and job performance, indicating
that with greater differences between the
two eyes (both near and far), the probabilities
of satisfactory performance on industrial jobs
of the types investigated here are approximately
the same as the probabilities of satisfactory
performance where the two eyes are equally
good. The general trend presented here graphically
indicates the over-all relationship for
acuity differences between the two eyes and
job performance, and suggests that such acuity
differences do not, at any difference level, significantly
affect the probabilities of satisfactory
performance for the combined group of jobs.

[Fig. 2. Over-all relationship between acuity differences in the two eyes and job performance. Horizontal axis: acuity differences between the two eyes; vertical axis: per cent in "high" criterion group; curves for near acuity and far acuity.]

Relationship between Acuity Differences in the
Two Eyes and Job Performance for Each Better
Eye Acuity Level. For a given better eye acuity,
the relationship existing between acuity
differences in the two eyes and job performance
is reflected by the percentage of employees at
each difference level who were in the high criterion
employee group. The basic relationships
are presented graphically for better eye
acuity scores 3 to 15 in Figures 3 to 6, inclusive.³
These figures represent the relationships
for all of the 92 jobs combined, both at the
near and far acuity distances.

For any given better eye acuity, the apparent
lack of relationship from the low acuity differences
to the high acuity differences indicates
that such differences do not significantly affect
the probabilities of satisfactory performance
on these jobs. Where the acuity difference
between the two eyes is large, the probability
of satisfactory performance on these jobs is
approximately the same as the probability of
satisfactory performance where the difference
is small. This general relationship holds for
all better eye acuity levels, both near and far.

Results of Statistical Test of Homogeneity of
Sample. Fisher's test of homogeneity of a
sample in a 2 × n′ classification, applied to
each sub-group for each better eye acuity
score, substantiates the basic relationships indicated
in the graphic presentations. None of
the proportions of high criterion employees
for a given better eye acuity level deviates
significantly from the theoretical proportion
for the group as a whole. The χ²
values obtained, the P values interpolated from
Fisher's Tables (1: 104-105), and the χ² values
for a P of .05 are presented in Tables 1 and 2,⁴
for far and near acuity distances. Fisher
states that if P is between .1 and .9 there is
certainly no reason to suspect the hypothesis
tested, or that the observed deviations could
not have occurred by chance. We have, therefore,
accepted his suggestion and hold that only
where P is smaller than .05 is there sufficient
indication of a significant deviation. None of
the P values for any better eye acuity level,
near or far, is as small as .05; the range for the
24 sub-groups is from .09 to .82, with 23 of the
24 falling within the .1 to .9 limits mentioned
above.

3 To reduce printing costs, Figures 3, 4, 5, and 6, and
Tables 1 and 2 have been deposited with the American
Documentation Institute and may be ordered as
Document [...] from American Documentation Institute,
[...] Street, N.W., Washington 6, D. C., remitting
[...] for microfilm (images 1 inch high on standard
35 mm. motion picture film) or [...] for photocopies
(6 × 8 inches) readable without optical aid.

This statistical test then confirms the ob- 
served relationship between acuity differences of 
the two eyes and job performance, at constant 
better eye acuity levels. For the 92 industrial 
jobs combined, as studied in this investigation, 
acuity differences are not related to probabil- 
ities of satisfactory job performance. 


Summary 


An investigation was made of the existing 
relationship between certain visual skills and 
job performance on a variety of industrial jobs. 
Ortho-Rater scores of approximately 5,600 em- 
ployees on 92 different industrial jobs com- 
prised the basic data. Using various measures 
of job performance as criteria, about 5,100 of 
these employees were identified as either high 
criterion group employees or low criterion 
group employees. Four measures of visual 
acuity were given particular attention and 
their relationship to job performance analyzed. 
These four acuity measures were: better eye 
acuity, near; better eye acuity, far; acuity dif- 
ferences between the two eyes, near; and acuity 
differences between the two eyes, far. Acuity 
differences between the two eyes were analyzed 
both for an over-all relationship to job per- 
formance, and for each better eye acuity level. 

The following conclusions are drawn from 
this investigation: 

1. For jobs of the type studied, a moderate
minimum of visual acuity in one eye or the
other is necessary for satisfactory job performance.

2. Beyond this minimum, probability of
satisfactory performance increases quite constantly
with greater degrees of acuity in the
better eye.

3. In the over-all relationship between acuity
differences in the two eyes and job performance,
such differences are not significantly
related to job performance.

4. A horizontal straight-line relationship
exists between acuity differences in the two
eyes and job performance, so that the probability
of satisfactory performance where acuity
differences are large is approximately equal to
the probability of satisfactory performance
where acuity differences are negligible.

5. For any given better eye acuity level,
acuity differences in the two eyes do not affect
the level of job performance for the types of
jobs studied.

4 See footnote 3.

This investigation has been primarily con- 
cerned with the over-all relationships for the 
92 jobs combined. It is possible that on 
certain industrial jobs that require particular 
types of visual skills, acuity differences in the 
two eyes may be related to job performance.
An extension of this investigation to employees 
on particular jobs would be necessary to de- 
termine these relationships. 


Received January 23, 1950. 


References


1. Fisher, R. A. Statistical methods for research workers.
(4th Ed.) London: Oliver and Boyd, 1932.

2. Guilford, J. P. Fundamental statistics in psychology
and education. New York: McGraw-Hill Book
Company, Inc., 1942.

3. Kephart, N. C. An analysis of professional eye
care and industrial efficiency. Trans. Amer.
Acad. Ophthal. Otolaryng., 1946, 51, 166-170.

4. Kephart, N. C. Visual skills and labor turnover.
J. appl. Psychol., 1948, 32, 51

5. McCormick, E. J. An analysis of visual requirements
in industry. Ph.D. Thesis, Purdue University, 1948.

6. Stump, N. F. Vision tests predict worker capability.
Factory Mgmt. and Maint., 1946, 104, 121-124.

7. Tiffin, J. Industrial psychology. (2nd Ed.) New
York: Prentice-Hall, Inc., 1947.

8. Tiffin, J. The use of visual data as an aid to
increase production and efficiency. Trans.
Amer. Acad. Ophthal. Otolaryng., 1944, 49,

9. Tiffin, J., and Greenly, R. J. Employee selection
tests for electrical fixture assemblers and radio
assemblers. J. appl. Psychol., 1939, 23, [...] 263.

10. Tiffin, J., and Wirt, S. E. The importance of visual
skills for adequate job performance. J. consult.
Psychol., 1944, 8, 80-89.

11. Wirt, S. E. Statistical laboratory for vision tests
at Purdue University. J. appl. Psychol., 1946,
30, 354-358.

12. Standard practice in the administration of the Bausch
and Lomb occupational vision tests with the Ortho-Rater.
Bausch and Lomb Optical Company,
Rochester, N. Y., 1944.


Point Centering of Signals on an Area * 


Adelbert Ford, David Rigler, and Genevieve E. Dugan 


Lehigh University 


This task relates to the precision with which
signals, consisting of white spots or patches,
can be located or "tracked" on a dark field,
such as that of a radar scope, by means of a
device by which an operator can move a point
cursor to the center of the signal, using a
pantograph lever system giving two degrees
of muscular freedom, and experimental amounts
of movement reduction.

The problem includes the discovery of the
absolute amount of the error, its comparison
with the sizes of errors previously reported for
scale reading techniques of locating signals,
the presentation of any tendencies towards
drifts and other variability in efficiency, and
some possible causes for the trends.

The necessity of precise signal location on
scope faces is a problem of human operation
which is critical in the technical methods of
obtaining navigational information for the use
of ships and commercial aircraft traveling in
congested regions at night and in weather conditions
adverse to good visibility.


History of the Problems 


A signal can be located on a scope face by 
referring it to a system of scaling lines. The 
configurational and systematic errors of this 
procedure have already been reported by Ford 
(5, 6) and by Reed and Bartlett (9). The 
average random error for this method has been 
shown to be about 0.020 inch of scope distance 
for clear signals, and about 0.040 inch for 
typical unclear signals. The systematic errors 
are too great to be tolerated. Individual 
differences are relatively small. 

Signals can also be located by a superimposed
cursor which actuates a remotely controlled
indicator. In this situation there are two
human error problems: (a) how accurately the
operator sets the cursor on the signal or "pip,"
and (b) how accurately he reads the indicator
dial. The first of these problems will be partially
answered in the present article. With
regard to the second, Chapanis (2, 3) and Long
(7) have shown that a direct-reading counter
is better than a scale with an indicator needle
when reading the numerical value of the position
of a signal which is already established
as to location, but not for the purpose of moving
an indicator to a specified numerical
position.

* This research was executed under Contract No.
W33-038-ac-22561 between the Institute of Research,
Lehigh University, and the USAF Air Materiel Command,
Aero Medical Laboratory, Wright-Patterson Air
Force Base, Dayton, Ohio.

Most of the war instruments used cursors 
with hair lines for centering the x- and the 
y-axes by separately controlled hand wheels. 
The use of a pantograph lever system for 
moving a pointer cursor with two degrees of 
muscular freedom was adopted by Reed and 
Bartlett (8) on PPI traffic survey scopes, and 
although they did not have a precision method 
for registering errors, they reported the center- 
ing procedure to be about twice as precise as 
the scale-reading method. This report is in 
the direction of our own findings on the sector 
scopes. In using cursors on changing target 
positions Ellson and Wheeler (4) reported the
"range effect," which is a tendency of the operator
to "overshoot" when the movement of a
target position is less than expected, and to
"undershoot" when the movement is greater
than expected. Bartlett and Sweet (1) noted
that when two operators were simultaneously
tracking the same target their agreement in
readings became poorer and poorer as the
successive target positions moved more rapidly.


Apparatus


Figure 1 is a reproduction of an artificial 
signal (the oval area in the center) surrounded 
by eight station positions, actual size. In 
some of the experiments a photographic repro- 
duction of a real radar “pip” was substituted 
for this artificial signal, at approximately the 
same size of signal area. 

The cursor was a black dot the size of a
No. 60 B and S drill inscribed on the under side
of a transparent plastic plate. This cursor
plate could be moved from any specified station
position to the center of the signal by means of
a pantograph system of levers such that experimental
ratios of knob movement to cursor
movement could be established at values of
1:1, 2:1, 4:1 or 6:1.¹ The vertical error of
the centering was indicated by an attached
rotating mirror which moved a light beam along
a horizontal scale at a magnification of 100 and
measured variations in settings to a precision
of 0.001 inch.

The signal was covered by a solenoid-actuated
shutter. The subject could open the
shutter at his own discretion, but the shutter
would close after an experimentally determined
number of seconds allowed for centering of the
target.

The target board was mounted thirty degrees
from horizontal and illuminated with thirty
foot candles of well diffused tungsten and
fluorescent light.

[Fig. 1. Artificial signal, the oval area in the center. Numbered dots are the station positions from which the operator moves the point cursor (actual size).]

1 For an expanded description of the apparatus, and
additional tables and charts of results, see Ford, A.,
Rigler, D. and Dugan, G. E., Pantograph Radar
Tracking: Point Centering Experiments, USAF Technical
Report No. [...], Air Materiel Command, Aero
Medical Laboratory, Wright-Patterson Air Force Base,
Dayton, Ohio.


Experimental Procedure


Method of Average Error. In this series the
subject made his own settings of the cursor on
the target. The order of the various conditions
was determined by the Latin Square
arrangement to equate fatigue and training
effects from one condition to another.

The statistical treatment is determined by
the fact that we discovered that all subjects
tended to show "drifts" up and down for what
they considered the center of the target area.
A method of running averages and running
standard deviations was adopted.


Table 1

Point Centering Experiments, Artificial Pip: Subject WEM
(Values expressed in decimal fractions of one inch)

Kind of                Time Limit   Stand. Dev.   Ave. Running   Max. Dev.
Tracker       Ratio    in Seconds   Total         S.D.           Running Average   [...]

Pantograph    6/1      2            .0038         .0026          .0088             [...]
Pantograph    4/1      2            .0026         .0020          .0045             [...]
Pantograph    2/1      2            .0027         .0024          .0029             [...]
Fing. Cent.   1/1      8            .0028         .0020          .0065             [...]
Fing. Cent.   1/1      6            .0031         .0024          .0053             [...]
Fing. Cent.   1/1      4            .0031         .0025          .0049             [...]
Fing. Cent.   1/1      2            .0036         .0028          .0028             [...]

Discrimination Limen (Weber's Law), S.D. = .0010.
S.D., Reading from Scale A = .0166.

Twenty-four trials were allowed for practice.
Then about 180 settings were made in a series
under a given experimental condition. The
average and the standard deviation were computed
for the first 24 readings. Then the first
eight were dropped, and the next eight readings
added, and the next average and standard
deviation noted. This was done successively
until the last 24 readings out of the 180 were
exhausted. Table 1 and Figure 2 have been
designed on this kind of data. Then the average
and the standard deviation of these
were computed.

The drift phenomenon required quantification,
so we established a "drift index" which
was the maximum change in the running average
divided by the average running standard
deviation. If the operator's running average
shifted from a -0.003 to a +0.003 inch, and
his average running standard deviation was
0.002 inch error, his drift index would be 3.0.
This example is fairly close to what actually
happened. (See Figure 2.)
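The running statistics and the drift index can be sketched in Python. Two points are assumptions made for the sketch: the "maximum change in the running average" is read as the difference between the largest and smallest running averages, and the standard deviation of each block is computed with N (not N - 1) in the denominator, since the text does not state which was used. The function names are hypothetical.

```python
from statistics import mean, pstdev


def running_stats(readings, window=24, step=8):
    """Average and standard deviation of successive overlapping blocks of readings.

    Each block holds `window` readings; the next block drops the first
    `step` readings and adds the next `step`.
    """
    stats = []
    for start in range(0, len(readings) - window + 1, step):
        block = readings[start:start + window]
        stats.append((mean(block), pstdev(block)))
    return stats


def drift_index(stats):
    """Maximum change in the running average divided by the average running S.D."""
    averages = [average for average, _ in stats]
    deviations = [deviation for _, deviation in stats]
    return (max(averages) - min(averages)) / mean(deviations)
```

With 180 readings this scheme gives 20 overlapping blocks. For the example in the text, a running average moving from -0.003 to +0.003 inch with an average running standard deviation of 0.002 inch gives 0.006 / 0.002 = 3.0.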

The Method of Constant Stimuli. Where the 
subject makes his own attempt to place the 
cursor in the center of a signal area one may 
consider three causes for the commission of an 
error: (a) the subject hasn’t the motor skill to 
make a precise positioning of the cursor point, 
(b) the subject hasn't the fineness of visual 
perceptual discrimination to see where the 
center is, or (c) the subject lacks an attitude 
of working to precision and works only to the 
standard he thinks will satisfy the supervisor. 

To get evidence on the second of the above 
factors, the experimenter set the cursor point 
in chance order above and below the actual 
center, and the subject reported "higher" or 
"lower" in a typical Weber-Fechner technique. 

The experimenter started with a plus-or-minus
0.004, then 0.003, and so on down to 0.001 inch
differences. For this purpose the difference
limen was considered to be that value which
the subject reported correctly by an amount
equal to the standard deviation of the probability
curve (68 per cent instead of the usual
50 per cent). The value of this limen, for
WEM, is given at the bottom of Table 1.
Such a value for each subject was assumed to
yield a measure of the pure perceptual factor
uncontaminated by the factor of motor skill.

Scale Reading Comparison. Using a set of
photographic reproductions of artificial pips of
the same dimension as that shown in Figure 1,
superimposed on the multiple scaling system
like that used in previous experiments which
yielded the highest accuracy, each subject was
required to estimate elevation of the signal
above or below a line of zero reference, by the
use of scaling lines only. Ninety-six readings
were taken for each subject. The value is
indicated under Table 1 after "S.D., Reading
from Scale A." This was done to show the
difference in precision between scale-reading
method and cursor-setting method, with individual
differences held constant.

[Fig. 2. Chart of error trends for Subject WEM on artificial signal. Panels: pantograph, 2 sec., ratio 2/1; finger centering. Curves: moving average and moving standard deviation, plotted against readings.]


Results 


1. Subjective Drift. The tendency to wobble
back and forth, above and below, the mean
center, with the majority of readings too high
for a while, then too low for a while, is pictured
in Figure 2, for WEM. All seven subjects did
this to some degree on both the artificial signals
and the reproduction of the natural signal.
The drift index was as much as three for every
subject on one or the other of the runs. We
are inclined to believe that this tendency to
wobble around a center is characteristic of
nearly any difference limen experiment by the
method of average error, but has been heretofore
hidden by the use of gross averages and
gross measures of dispersions. About 0.003
inch wobble is the expectancy for this size of
signal.

2. Pantograph Reduction Ratio. The best 
reduction ratio for hand movement in relation 
to cursor movement was close to 2:1. This is 
shown pictorially in Figure 2 for WEM, and 
in Table 1. The criteria for making this 
statement are: (a) least amount of drift, (b) 
smallest average running standard deviation 
of errors, and (c) smallest total standard 
deviation for errors. On the artificial signals, 
two out of the five subjects did almost equally 
well on the 4:1 lever ratio. On the reproduc- 
tions of natural signals all three subjects were 
superior on the 2:1 ratio. Since the last 
situation is closer to the job requirements, the 
vote must go to the 2:1 ratio. 

3. Comparison to Scale Reading. Using the 
total standard deviation (which includes vari- 
ability due to drift) as the criterion for cursor 
centering by the 2:1 pantograph ratio, all 
five subjects showed a heavy superiority for 
pantograph centering as compared with the 
technique of scale reading. Expressing the
standard deviation of cursor centering errors
as percentage of the scale reading errors, we
obtained the following values: 16, 46, 35, 44,
27. Over all, this means that the expectancy
is that pantograph centering is about three
times as accurate as reading the same signal on
a superimposed scale (with much less mental
effort). If a method of reducing drift could be
found, the advantage would be still greater.

4. Hand Positioning. Allowing the subject
to rest the palm of his hand on the [...] when
making a final centering resulted in a slightly
superior score, statistically, but the margin
was very small and of doubtful statistical
significance. Subjects became rapidly antagonistic
to the compulsory free arm [...]
without the palm rest, however, and reported
that it was a strain to do it in this manner.

5. Perceptive versus Motor Causes of Error.
The limen of perceptive discrimination, as
indicated by the method of constant stimuli,
is only a half to a third the value of the total
standard deviation of cursor centering errors.
This means that perceptive errors account for
only a small part of the total error by the pantograph
centering technique. We may probably
make a large allowance for motor skills and
precision attitude.

6. Effect of Time Restriction. A point cursor
can be centered on a signal in a time as short
as two seconds before any appreciable increase
in error is generated. This was true of all
eight subjects.

7. Type of Cursor. When the point cursor
was surrounded by rings a tenth of an inch
apart, like a "bull's eye," the drift tendency
was reduced for all three subjects on the
natural signal series. This suggests that the
problem of drift can be solved by cursor
design.

8. Absolute Size of Errors. The total standard
deviation of pantograph centering errors
for the 2:1 lever ratio (which includes the
drift tendency) for the five subjects on the
artificial signals was a plus-or-minus 0.0027,
0.0047, 0.0050, 0.0063, and 0.0026 inch, or an
average expectancy of four thousandths of an
inch for a signal of this size. The values for
the centering of the reproductions of the
natural pips were 0.0068, 0.0070, and 0.0050,
or an average expectancy of six thousandths.
Previous work by Ford (5, 6) has shown that
natural signals are about twice as difficult to
localize on a surface as artificial signals with
sharp contours.

Summary of Conclusions


1. Centering signals on an area by means of
a pantograph-controlled cursor is two
to four times as precise as the location of
such signals by scale reading methods.



2. Using a point cursor, the subjects tend 
to exhibit systematic drift back and forth 
across the signal center. 

3. A reduction ratio of two units of hand 
movement to one unit of cursor movement is 
near the optimum.

4. The use of a palm rest resulted in a slight
superiority of scores in centering, and was
greatly preferred by all subjects. 

5. Centering in two dimensions can be done
in a minimum of two seconds before reaching
definite increases in error. 

6. The limits of visual perception account 
for less than half of the range of errors. 
Skill and precision attitude may account for 
the remainder. 

7. A point cursor surrounded by “bull’s
eye” rings reduces the tendency to drift around
the actual center.

8. With a signal 0.261 inch in length, the
standard expectancy in centering is a plus-or-minus
error of four thousandths of an inch for
artificial signals, and six thousandths of an 
inch for reproductions of real radar pips. 
Individual differences are relatively large. 


Received February 9, 1950. 


References 


1. Bartlett, N. R., and Sweet, A. L. The agreement in
range and bearing reports from two VF Radar
Repeaters tracking the same target simultaneously
with the B Scope. Memorandum Report No.
166-I-12, Special Devices Center, U. S. Navy,
1 November 1946, restricted. Johns Hopkins
University.

2. Chapanis, A. Speed of reading target information
from a direct reading counter type indicator versus
conventional radar bearing and range dials.
Memorandum Report No. 166-I-3, Special Devices
Center, U. S. Navy, 1 November 1946, restricted.
Johns Hopkins University.

3. Chapanis, A. The relative efficiency of a bearing
counter and bearing dial for use with PPI
Presentations. Memorandum Report No. 166-I-26,
Special Devices Center, U. S. Navy, 1 August
1947, restricted. Johns Hopkins University.

4. Ellson, D. G., and Wheeler, L. The range effect.
Technical Report No. 5813, May 1949, Air
Materiel Command, Wright-Patterson Air Force
Base, Dayton, Ohio. University of Indiana.

5. Ford, A. Types of errors in location judgments on
scaled surfaces. I. Errors of configuration.
J. appl. Psychol., 1949, 33, 373-381.

6. Ford, A. Types of errors in location judgments on
scaled surfaces. II. Random and systematic
errors. J. appl. Psychol., 1949, 33, 382-394.

7. Long, G. E. Speed and accuracy of readings as a
function of design in the sensitive airspeed
indicator. Technical Report No. 5836, August 1949,
Air Materiel Command, Wright-Patterson Air
Force Base, Dayton, Ohio.

8. Reed, J. D., and Bartlett, N. R. Comparison of
manual and standard methods of target
identification. Memorandum Report No. 166-I-9, Special
Devices Center, U. S. Navy, 1 February 1947,
restricted. Johns Hopkins University.

9. Reed, J. D., and Bartlett, N. R. The accuracy of
range information on a PPI with respect to the
distance of the target from the range rings.
Memorandum Report No. 166-I-29, Special Devices
Center, U. S. Navy, 1 November 1947, restricted.
Johns Hopkins University.


Influence of Friction in Making Settings on a Linear Scale * 


William Leroy Jenkins, Louis O. Maas, and David Rigler 
Lehigh University 


In a previous study of the process of making
settings on a linear scale by means of a control 
knob (1), a number of variables were investi- 
gated. The most significant of these turned 
out to be the ratio between pointer-movement 
and knob-turn. A ratio providing one or two 
inches of pointer-movement for one complete 
turn of the knob appeared to be optimal. 
Finer ratios wasted time and effort in traveling 
to the approximate location. Coarser ratios 
were poorly adapted to making the final ad- 
justment. 

In the apparatus employed for these studies 
there is a noticeable resistance with extremely 
coarse ratios. The question arose: Would the 
optimal ratio be changed by artificially equalizing
the friction at all ratios? The present
study is concerned with this question, and 
also with the more general problem of the 
influence of added friction at the optimal ratio. 


Apparatus and Procedure 


The apparatus and procedure were essen- 
tially the same as described in the previous 
study (1). The subject matches the position 
of a lighted insert in a black bakelite scale 
with a pointer controlled by a rotary knob. 
The permitted error-tolerance is determined by 
the width of the pointer in relation to the 
width of the lighted insert. In the present
investigation, the permitted error tolerance
was .007" (insert .032"; pointer .025"), and
the knob diameter 2¾".

By means of two chronoscopes, time is
measured separately for travel to the approximate
location and for making the final adjustment.
Similarly, action potentials from the
active forearm are accumulated and measured
separately during travel and during final
adjustment. Mean travel time is computed
for two standard distances: 10 sixteenths and
50 sixteenths of an inch. Mean total time is
then computed as mean travel time plus mean
adjusting time. Similar computations are
made for action potentials, which are recorded
in meter-scale readings having no intrinsic
significance but comparable for the same
subject.

* This research was executed under Contract No.
W33-038-ac-22561 between the [illegible],
Lehigh University, and the USAF Air Materiel
Command, Aero Medical Laboratory, Wright-Patterson
Air Force Base, Dayton, Ohio.

In the present experiments, friction was
added by means of a Prony brake on the
shaft immediately behind the control knob.
For the series on equalized friction, the brake
was adjusted at each ratio so that a pull of
3[illegible] grams was required at the periphery of the
2¾" knob, the same as for the coarsest ratio
used. For the series on the influence of friction
at the optimal ratio, sufficient friction
was applied to require pulls of 400, 700,
1,000, and 1,300 grams at the periphery of the
knob.

Four subjects were used, all young men
students. One of them (DMS) had served
in several previous experiments. The other
three were new.


Results—Equalized Friction 


Table 1 shows mean total time and mean
total potential at six different ratios without
equalizing friction. These results are analogous
to those reported in the earlier study.
The optimal ratio is in the region of one or two
inches of pointer movement for one complete
turn of the knob. With a travel distance of
10 sixteenths of an inch, the ratio of 1.18 is
optimal. When the travel distance is 50 sixteenths
of an inch, the ratio of 2.42 is slightly
better.

Table 2 gives mean total time and mean
total potential for the same six ratios with
friction artificially equalized. At 10 sixteenths
travel distance, there is now little to choose
between ratios 1.18 and 2.42. At 50 sixteenths
travel distance, the ratio 2.42 becomes significantly
better.
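Why a coarser ratio pays off mainly over the longer travel distance can be seen by counting knob turns. The sketch below is a rough illustration only: it assumes the ratio figures are inches of pointer movement per complete knob turn, as the text describes, and it ignores the final adjustment entirely. The function names are invented for the example.

```python
# Ratios listed in Table 1, read here as inches of pointer movement
# per complete knob turn (an assumption based on the text).
RATIOS = [1.18, 2.42, 4.08, 6.28, 9.70, 16.3]
TRAVEL_SIXTEENTHS = [10, 50]  # the two standard travel distances


def knob_turns(travel_sixteenths, ratio):
    """Knob turns needed to move the pointer the given distance."""
    travel_inches = travel_sixteenths / 16.0
    return travel_inches / ratio


def turns_saved(travel_sixteenths, fine_ratio, coarse_ratio):
    """Turns saved by using the coarser ratio over the same travel."""
    return (knob_turns(travel_sixteenths, fine_ratio)
            - knob_turns(travel_sixteenths, coarse_ratio))


for distance in TRAVEL_SIXTEENTHS:
    for ratio in RATIOS:
        turns = knob_turns(distance, ratio)
        print(f"{distance:>2} sixteenths at ratio {ratio:>5}: {turns:.2f} turns")

saved_short = turns_saved(10, 1.18, 2.42)
saved_long = turns_saved(50, 1.18, 2.42)
print(f"turns saved going from 1.18 to 2.42: "
      f"{saved_short:.2f} (10 sixteenths), {saved_long:.2f} (50 sixteenths)")
```

On this reading, changing from 1.18 to 2.42 saves roughly a quarter of a turn at 10 sixteenths but well over one full turn at 50 sixteenths, which is consistent with the higher ratio showing an advantage only at the longer travel distance.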

The two sets of data shown in Table 1 and
Table 2 were taken at different times, and
absolute values cannot be directly compared.

Table 1

Influence of Ratio on Time and Potential—Friction Not Equalized

Mean Total Time

             10 Sixteenths Travel               50 Sixteenths Travel
Ratio    DMS     RBC     BRR     JMC        DMS      RBC     BRR     JMC
 1.18    16.3    26.8    24.8    26.7       24.3     34.0    30.4    33.5
 2.42    19.1*   27.5    25.3    32.6*      [2T. I*] 31.5    29.3    36.6
 4.08    19.2*   30.9*   26.4    35.2*      23.6     34.1    29.6    38.8*
 6.28    19.5*   33.8*   28.8*   33.9*      23.5     35.8*   32.0    36.7*
 9.70    23.8*   34.8*   31.6*   45.2*      26.6     37.2*   34.8*   48.4*
16.3     32.8*   48.8*   [ST]    92.5*      34.4*    50.4*   39.7*   56.1*

Mean Total Potential

             10 Sixteenths Travel               50 Sixteenths Travel
Ratio    DMS     RBC     BRR     JMC        DMS      RBC     BRR     JMC
 1.18    14.4    18.1    19.7    21.5       23.2     25.7*   28.5*   34.7*
 2.42    17.1*   18.1    18.1    21.4       25.1     21.7    22.9    27.8
 4.08    16.5*   21.0*   19.6    23.7       21.3     24.2*   23.6    29.7
 6.28    18.1*   24.9*   23.6*   26.0*      22.4     27.7*   28.4*   30.0
 9.70    19.7*   23.8*   27.0*   36.6*      23.3     26.2*   31.8*   41.0*
16.3     24.9*   38.1*   41.1*   45.1*      26.9*    40.9*   44.3*   49.5*

Tolerance: .007". Knob diam.: 2¾".
* Significant difference beyond 1% level from italicized figures.


Table 2

Influence of Ratio on Time and Potential—Friction Equalized at All Ratios

Mean Total Time

             10 Sixteenths Travel               50 Sixteenths Travel
Ratio    DMS     RBC     BRR     JMC        DMS      RBC     BRR     JMC
 1.18    26.8    27.8    20.2    40.4       41.2*    44.2*   28.6    55.2*
 2.42    24.9    26.6    [23]    [A8]       31.3     34.6    26.5    45.6
 4.08    28.0    29.8    21.9    43.4       34.4     31.6    28.5    49.4
 6.28    30.3*   30.8*   24.7*   47.5*      35.4*    35.2    27.7    52.3
 9.70    31.9*   35.1*   27.4*   54.9*      36.5*    38.3    30.0*   59.3*
16.3     [?]     38.6*   35.8*   70.9*      47.4*    41.8*   37.8*   73.3*

Mean Total Potential

             10 Sixteenths Travel               50 Sixteenths Travel
Ratio    DMS     RBC     BRR     JMC        DMS      RBC     BRR     JMC
 1.18    [?]     [?]     [?]     31.0       42.2*    36.4*   33.5*   46.0*
 2.42    [?]     [?]     [?]     [?]        [?]      [?]     [?]     [?]
 4.08    [?]     [?]     19.7    [?]        33.9     20.3    27.1    39.1
 6.28    [?]     [?]     21.9    [?]        35.3     37.0*   27.7    46.3*
 9.70    [?]     [?]     [?]     [?]        48.8*    33.1    34.6*   61.7*
16.3     [?]     [?]     [?]     [?]        [?]      [?]     [?]     [?]

[?] marks an entry that is illegible in the source. Legible entries
whose row and column cannot be determined: 28., 31., 45.2*, 29.9*.

Tolerance: .007". Knob diam.: 2¾".
* Significant difference beyond 1% level from italicized figures.


[Fig. 1. Influence of ratio on travel time. Panels: 10 sixteenths and 50 sixteenths travel, each with friction equalized and friction not equalized. Axes: ratios; time in tenth seconds.]

[Fig. 2. Influence of ratio on adjusting time. Panels: friction equalized and friction not equalized. Axes: ratios; time in tenth seconds.]

[Fig. 3. Influence of friction on time. Panels: travel at 10 sixteenths, travel at 50 sixteenths, and adjust. Axes: friction; time in tenth seconds.]

[Fig. 4. Influence of friction on potential. Panels: travel at 10 sixteenths, travel at 50 sixteenths, and adjust. Axis: friction.]


Table 3

Influence of Friction on Time and Potential at the Optimal Ratio (1.18)

Mean Total Time

              10 Sixteenths Travel               50 Sixteenths Travel
Friction  DMS     RBC     BRR     JMC        DMS      RBC     BRR     JMC
  100     22.4    26.1    16.6    40.3       29.2     33.7    22.2    49.5
  400     23.8    23.9*   17.4    39.1       39.0*    40.3*   27.8*   54.3*
  700     24.5    26.3    18.0    45.4*      44.5*    47.9*   30.8*   64.2*
1,000     23.7    28.0    17.6    47.6*      48.1*    50.2*   32.0*   68.4*
1,300     25.8*   27.6    18.8*   48.7*      51.0*    51.6*   34.8*   [?]

Mean Total Potential

              10 Sixteenths Travel               50 Sixteenths Travel
Friction  DMS     RBC     BRR     JMC        DMS      RBC     BRR     JMC
  100     17.5    17.9    15.1    23.7       24.3     [?]     23.9    34.5
  400     22.0*   17.0    18.9*   39.9       37.2*    34.2*   38.1*   46.5
  700     23.0*   20.7*   21.4*   36.9*      49.8*    42.3*   43.8*   56.9*
1,000     25.1*   22.0*   21.9*   39.7*      55.1*    44.2*   44.3*   62.5*
1,300     28.5*   25.3*   23.5*   40.6*      56.1*    50.1*   52.3*   66.5*

[?] marks an entry that is illegible in the source.

Ratio: 1.18. Tolerance: .007". Knob diam.: 2¾".
* Significant difference beyond 1% level from italicized figures.

However, it is evident that equalizing the
friction at all ratios does not change the region
of the optimum at short travel distances and
has only a moderate effect at longer travel
distances.

Figures 1 and 2 show graphically why this
is true. In Figure 1, travel time is compared
with and without equalized friction. Only
in the drop from 1.18 to 2.42 at 50 sixteenths
travel is there any marked difference. In
Figure 2, adjusting time is compared in a
similar manner. With or without equalized
friction, the ratio of 1.18 gives the shortest
adjusting time. Any advantage of a higher
ratio is, therefore, traceable to the slightly
faster travel over long distances. In most
ordinary adjustments, the ratio of 1.18 remains
advantageous, even when friction is artificially
equalized.


Results—Added Friction at the Optimal Ratio


Normally in the apparatus a pull of approximately
100 grams at the periphery of the
2¾" knob is required to move the pointer.
By means of the Prony brake, friction was
added so that pulls of 400, 700, 1,000, and
1,300 grams were required.

Mean total time and mean total potential
are shown in Table 3. At the shorter travel
distance, even the highest level of friction
increases the total time only slightly in three of
the four subjects. The effect on total potential
is more marked. At the longer travel
distance, the slowing effect of friction [illegible]
even at 400 grams and becomes [illegible].

In Figure 3 it will be noticed that added
friction has its main effect on travel
time, while adjusting time is scarcely affected
at all. Adjusting time at the [illegible] friction
level is slightly, though not significantly,
shorter than at 100 grams. In Figure 4,
travel potential shows a noticeable increase
with added friction, but adjusting
potential rises sharply in the case of only one
of the four subjects. [Illegible] added friction
at 400 grams [illegible]
the higher level distinctly irksome even when
no marked slowing of settings is evident.
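The statement that friction raises total time only slightly at the shorter travel distance can be put in relative terms. The sketch below simply takes the mean total times for 10 sixteenths travel at 100 and 1,300 grams from Table 3 and expresses the change as a percentage; the percentage framing and the function name are additions for illustration, not part of the original analysis.

```python
# Mean total time at 10 sixteenths travel, from Table 3,
# at the lowest (100 g) and highest (1,300 g) friction levels.
time_100g = {"DMS": 22.4, "RBC": 26.1, "BRR": 16.6, "JMC": 40.3}
time_1300g = {"DMS": 25.8, "RBC": 27.6, "BRR": 18.8, "JMC": 48.7}


def percent_increase(base, new):
    """Relative change from base to new, in percent."""
    return 100.0 * (new - base) / base


increase = {
    subject: percent_increase(time_100g[subject], time_1300g[subject])
    for subject in time_100g
}

for subject, value in increase.items():
    print(f"{subject}: {value:.1f}% longer at 1,300 g than at 100 g")

largest = max(increase, key=increase.get)
print("largest relative increase:", largest)
```

On these figures the increase is largest for JMC, at about a fifth, and smaller for the other three subjects.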


Summary 


Previously it had been shown that a ratio 
of one or two inches of pointer movement for 
one complete turn of a control knob was 
optimal for matching settings on a linear scale. 
In the present study, friction was artificially 
equalized at all ratios by means of a Prony 
brake. Even with equalized friction, the 
optimal ratio was unchanged for short travel 
distances. With large amounts of travel before
setting, the optimal ratio was slightly
increased. The influence of added friction at 
the optimal ratio was also studied. Added 
friction, even in excessive amounts, has no 
effect on adjusting time but does increase 
travel time and the action potentials from the 
active forearm. 


Received February 9, 1950. 


Reference 


1. Jenkins, W. L., and Connor, M. B. Some design 
factors in making settings on a linear scale. 
J. appl. Psychol., 1949, 33, 395-409. 


The Galvanic Skin Response as a Test of Advertising Impact * 


Edwin Golin and Samuel B. Lyerly 


The University of North Carolina 


In preparing an advertising campaign, the 
advertising agency is usually faced with the 
job of selecting the most effective layout for 
publication. The copy department submits 
several layouts concerning the product to be 
advertised and from this group, one arrange- 
ment with the greatest potential "impact" 
upon prospective consumers must be chosen. 
A misjudgment in this selective process may 
involve a large loss of capital, time, and energy. 
To avoid this waste, many techniques have 
been employed in an attempt to make the 
selection less haphazard. Some of the methods 
used are subjective judgments of large samples 
of population, experts’ judgments, coupon 
returns, and so on. However, none of these 
methods has proved completely satisfactory 
because of either low validity or the relatively 
great expense and time factors involved. Ex- 
periences of this sort have placed the adver- 
tising industry on the alert for new methods 
of testing the impact of advertisements 
before the publication stage. 

One possible way of approaching this pro- 
blem is through the use of the galvanic skin 
response (GSR). Briefly, the GSR results 
from changes in skin resistance to an induced 
current. These changes of resistance may be 
read off a scale by deflections from a mirror 
galvanometer. It is assumed by many in- 
vestigators that the GSR is a reflection of 
general autonomic nervous activity and that 
autonomic activity accompanies emotional 
states; hence GSR may be considered as one 
possible index of emotion. Applied to an 
advertisement, then, a layout of highly affective
components for any particular observer
should result in greater GSR deflections than
a less affective one. The GSR method, if
found valid, has the advantage of being a 
low-cost, time-saving technique. 

With this in view, investigators have begun
experimenting with various types of GSR apparatus
and advertising media. Among the
limited amount of research in this area, the
most prominent is the study by Eckstrand and
Gilliland (1). These investigators found a
significant agreement between GSR changes
to a series of advertisements and the sales
effectiveness of those advertisements. They
concluded that the psychogalvanic method of
testing advertising material could be a predictive
technique if handled under properly
controlled conditions.

It was because of this rapidly growing
interest in GSR as a copy testing method that
the authors in conjunction with an advertising
corporation attempted to set up a preliminary
factorial experiment that would
indicate whether the GSR of a group of subjects
would show significant differences when
a series of advertisements concerning the
same product was presented. This was to be
a basic study and, as such, was not to involve
other criteria such as subjective reports, sales
tests, coupon returns, and so on. If the GSR
did indicate significant differences among advertisements,
then this could be used as the
basis for further investigation and interpretation.

* The authors wish to thank N. W. Ayer & Son,
Inc., for financial aid which made this study
possible. Special thanks are due Mr. James M. Wallace,
Vice-President, and Mrs. Margaret H. Rogers, Director,
Copy Research Bureau, for their interest and cooperation.


The Experiment 


Advertising Material. Three products were
chosen (nose tissue, ice cream, and air travel)
which were considered within the range of
interests of the subjects tested. These products
were given fictitious brand names.
Four advertisements were prepared for each
product, varying the arrangement, or layout.
Each ad consisted of a short slogan (three to
six words), a picture, and the “brand” name.
Layout A, for each product, featured prominently
a picture of the article or service involved
(for example, a large box of nose tissue or a
picture of an airliner). Layout B included a
picture of a single person using the advertised
product (e.g., a boy eating an ice cream cone).
Layout C portrayed two human figures, one
male and one female, in poses which were


not directly related to the product. Layout 
D included pictures of miscellaneous content 
differing from Layouts A, B, and C in that the 
pictures were more elaborate and contained a 
much greater amount of detail. For all 
layouts, the slogans were appropriate to the 
picture and the product, and were devised in 
accordance with good advertising practice. 
The brand name was at the bottom of the 
advertisement in each case; but for each prod- 
uct, the positions of picture and slogan were 
alternated, the picture being at the top in 
half the ads, and the slogan at the top in the 
remaining half. The twelve ads were photo- 
graphed in black and white, and 2 X 2 inch 
glass slides were prepared for each. These 
ads were supplied by the Copy Department 
of N. W. Ayer & Son, Inc. 

Subjects. The subjects were sixty male 
undergraduate and graduate students ranging 
from eighteen to thirty-three years of age. 

Apparatus. The GSR apparatus was simi- 
lar to the alternating current circuit described 
by Grant (3). A Rubicon mirror galvanom- 
eter was used and the deflections accompany- 
ing changes in skin resistance were read off a 
scale graduated in centimeters. 

Electrodes for the GSR apparatus were
dry, polished silver discs attached to the palm
and back of the right hand. Constant pressure
was maintained by a clamp arrangement
which, though comfortable, restricted excess
hand movement.

The slides were projected on a daylight
screen by a Keystone viewer operated on
time exposure.

Procedure. The experiment was conducted
in a dimly lighted room so situated that there
was a minimum of distraction from the general
activity of the building. The subject was
comfortably seated approximately ten feet
from the screen and so situated that the GSR
apparatus and the viewer were not within his
line of vision. The size of the image on the
screen was approximately three feet [illegible].
The electrodes were attached to the subject's
hand and at least five minutes were
allowed for adaptation of the subject and
stabilization of GSR light deflections. Instructions
were given to observe the advertisements
as if the subject were glancing
through a newspaper or magazine. A sample
slide was presented followed by the twelve
test slides.

The order of presentation was planned in
accordance with the Latin square principle
in such a way that over the group of 60 subjects
each advertisement occupied a given
ordinal position the same number of times.
The twelve ads were arranged into five 12 x
12 Latin squares, and each of the 60 subjects
was assigned at random to one of the rows of
one square. The influence of contiguity upon
responses to consecutive ads dealing with the
same product or having the same layout was
controlled by designing the Latin squares so
that the product and layout effects were distributed
systematically over the twelve ordinal
positions. The first four ads in any row
constituted one permutation of the four layouts,
the second set of four was another
permutation of the layouts, and the third set
of four still another permutation. Similarly,
the first and each succeeding set of three ads
was a permutation of the three products.
For example, the sequence presented to one
subject was:

Order:          1   2   3   4   5   6   7   8   9   10  11  12
Advertisement:  3C  1B  2D  3A  2B  1D  2A  1C  3D  1A  2C  3B


Here the numbers 1, 2, 3 identify the three
products, and the letters A, B, C, D, the four
layouts; i.e., "3C" indicates that the first
stimulus presented to this subject was the ad
representing Product 3 in Layout C. The
first group or block of four, comprising orders
1-4, has each of the four layouts represented
once; and so do the two remaining blocks,
5-8 and 9-12. Orders 1-3 for this subject include
each of the three products, as do orders
4-6, 7-9, and 10-12. The order of presentation
was, of course, different for every subject,
since 60 permutations of the basic sequence
were used, each fulfilling the same conditions
concerning the distribution of products and
layouts.
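The conditions placed on each presentation sequence can be stated as a short check. The sketch below tests the one example sequence quoted in the text (3C 1B 2D 3A 2B 1D 2A 1C 3D 1A 2C 3B). The function names are invented for illustration, and the check covers only the within-sequence conditions, not the balance of ordinal positions across the five Latin squares.

```python
# Example sequence from the text: digit = product, letter = layout.
sequence = ["3C", "1B", "2D", "3A", "2B", "1D",
            "2A", "1C", "3D", "1A", "2C", "3B"]


def blocks(items, size):
    """Split a list into consecutive blocks of the given size."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def is_balanced(seq):
    """True if every block of four ads contains each layout once, every
    block of three contains each product once, and no ad is repeated."""
    products = [ad[0] for ad in seq]
    layouts = [ad[1] for ad in seq]
    layouts_ok = all(sorted(b) == ["A", "B", "C", "D"]
                     for b in blocks(layouts, 4))
    products_ok = all(sorted(b) == ["1", "2", "3"]
                      for b in blocks(products, 3))
    all_distinct = len(seq) == 12 and len(set(seq)) == 12
    return layouts_ok and products_ok and all_distinct


print("example sequence satisfies the conditions:", is_balanced(sequence))

# Swapping the ads in orders 4 and 5 breaks the layout condition.
swapped = sequence[:3] + [sequence[4], sequence[3]] + sequence[5:]
print("sequence with orders 4 and 5 swapped:", is_balanced(swapped))
```

The swapped sequence fails because the first block of four would then contain Layout B twice and Layout A not at all.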
The layouts were projected for four seconds,
during which time the GSR deflections were
recorded. There was an interval of approximately
thirty seconds between the presentations
of successive stimuli.


Results 


Table 1 contains the mean deflections in 
centimeters for each of the twelve advertise- 
ments individually, and the means for the 
product and layout groups. It will be noted 
that the mean deflections for the three pro- 
ducts vary but little. The means for layouts, 
however, show considerably greater disper- 
sion; the largest being associated with Layout 
C, which featured pictures of a young couple 
(a not unlikely result from a sample of college 
men), and the smallest with Layout A, the 
chief characteristic of which was a large picture 
of the product or service advertised. 
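The product and layout means follow directly from the twelve cell means in Table 1. The sketch below recomputes them from the published cell values, with decimal points restored where the scan dropped them. Because it starts from rounded cell means rather than the raw deflections, small last-digit differences from the printed margins are possible. Variable names are illustrative only.

```python
# Cell means in centimeters from Table 1: layout -> [product 1, 2, 3].
cell_means = {
    "A": [2.73, 1.85, 2.64],
    "B": [2.28, 2.52, 3.58],
    "C": [4.20, 3.52, 2.22],
    "D": [2.66, 3.05, 2.76],
}

# Each cell has the same N (60), so unweighted averages are appropriate.
layout_means = {layout: sum(v) / len(v) for layout, v in cell_means.items()}
product_means = [
    sum(cell_means[layout][j] for layout in cell_means) / len(cell_means)
    for j in range(3)
]
grand_mean = sum(sum(v) for v in cell_means.values()) / 12

for layout, mean in layout_means.items():
    print(f"Layout {layout}: {mean:.2f}")
for j, mean in enumerate(product_means, start=1):
    print(f"Product {j}: {mean:.3f}")
print(f"Grand mean: {grand_mean:.2f}")

largest = max(layout_means, key=layout_means.get)
smallest = min(layout_means, key=layout_means.get)
print("largest layout mean:", largest, "| smallest:", smallest)
```

The recomputed layout means span a wider range than the product means, with Layout C highest and Layout A lowest, as described in the text.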

As is often the case with data of this kind, 
distributions of the deflections were positively 
skewed, and there was a distinct tendency for 
the means and standard deviations among 
subjects to vary together. Applications of the 
usual statistical tests led to the rejection of 
the assumptions of normality and of homo- 
geneity of variances and to the application of 
a logarithmic transformation to the data. The 
resulting distributions were more nearly normal 
and Bartlett’s chi-square test of homogeneity 
(2, pp. 195-197) indicated that the null hypoth- 
esis could not be rejected at the 10% level. 
This degree of correspondence was considered 
adequate to warrant the application of anal- 
ysis of variance. 

Table 1

Mean Deflection (in Centimeters) for the Twelve
Advertisements Classified by Product
and by Layout *

                      Products
Layouts           1       2       3     Row Means
A               2.73    1.85    2.64      2.41
B               2.28    2.52    3.58      2.79
C               4.20    3.52    2.22      3.31
D               2.66    3.05    2.76      2.82

Column Means    2.97    2.74    2.80    Grand Mean 2.83

* N = 60 for each cell.


Table 2 contains the results of the analysis.
The residual mean square was used as the
denominator in all F-tests. This residual
mean square includes, in addition to "error"
effects, any second-order interactions which
may be present plus a component arising from
response differences attributable to the order
of presentation, or time sequence. The mean
GSR plotted against order of presentation
yields a negatively accelerated curve, dropping
from a mean of 3.70 cm. on the first
presentation to 2.05 cm. on the twelfth and
indicating that some sort of adaptation was in
progress throughout the experiment. As a
consequence of the experimental design, this
circumstance operates primarily to inflate the
residual mean square and has the effect of
lowering all F's. It would be possible to
make adjustments by taking account of the
regression of GSR on order of presentation,
but the conclusions drawn from the study
would not be affected.


Table 2

Analysis of Variance of Logarithmically
Transformed GSR Data

                 Sums of             Mean
Source           Squares    d.f.    Square      F
Products           .133       2      .066       .74
Layouts            .867       3      .289      3.25*
Subjects         18.880      59      .320      3.60**
Interactions:
  P X L           2.111       6      .352      3.96**
  P X S          21.239     118      .180      2.02**
  L X S          21.241     177      .120      1.35*
Residual         31.507     354      .089
Total            95.978     719

* Significant at 5% level.
** Significant at 1% level.
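The F column of Table 2 can be reproduced by dividing each mean square by the residual mean square, as the text describes. The sketch below does only that, plus a check that the degrees of freedom add up to the 719 expected for 60 subjects by 12 advertisements. It works from the rounded mean squares as printed, and the variable names are illustrative.

```python
# Mean squares and degrees of freedom as printed in Table 2.
mean_squares = {
    "Products": 0.066,
    "Layouts": 0.289,
    "Subjects": 0.320,
    "P x L": 0.352,
    "P x S": 0.180,
    "L x S": 0.120,
}
degrees_of_freedom = {
    "Products": 2, "Layouts": 3, "Subjects": 59,
    "P x L": 6, "P x S": 118, "L x S": 177, "Residual": 354,
}
residual_mean_square = 0.089

# Every F ratio uses the residual mean square as its denominator.
f_ratios = {source: ms / residual_mean_square
            for source, ms in mean_squares.items()}

for source, f in f_ratios.items():
    print(f"{source:<9} F = {f:.2f}")

total_df = sum(degrees_of_freedom.values())
print("total d.f.:", total_df, "| expected:", 60 * 12 - 1)
```

Since the residual term also absorbs the order-of-presentation effect, these ratios are, as the authors note, somewhat lower than they would be with that effect removed.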

From Table 2 it will be noted that differences
among subjects and differences among
layouts are significant. Differences among
products are, if anything, smaller than one
would expect to find by chance, though not
significantly so. All three of the first-order
interactions are significant. The interaction
of products and layouts is interpreted as indicating
that, in the sample examined, the
GSR response to an advertisement of a given
product depends upon the way the advertisement
is presented; i.e., that the most effective
kind of layout for one product is not most
effective for another. The two interaction
effects involving subjects indicate that response 
tendencies to different layout styles and to 
different products vary from individual to 
individual, depending upon differences in 
needs, experiences, preferences, or other in- 
dividual characteristics. 

The interpretations presented above should 
not be extended to apply to advertised prod- 
ucts in general, to layouts in general, or to 
individuals in general. Neither the products 
nor the layout styles used are random samples 
of any clearly defined "populations" of prod- 
ucts or layouts. The fact that the products 
were given fictitious brand names means, of 
course, that these findings cannot be general- 
ized to advertisements of familiar brands 
toward which attitudes and response tenden- 
cies may be present. The sample of subjects 
(college men) is highly restricted, and yields 
no dependable predictions concerning responses 
to advertisements by people generally. 

A factor whose effects cannot be precisely 
evaluated in this design is the brightness 
differences among the projected slides. These 
differences, which were considerable, were not 
anticipated, and no provision could be made 
for them in the factorial analysis. The bright- 
ness of each slide was measured with a Weston 
Photometer and compared with the average 
GSR for that slide. The coefficient of correlation
between brightness and average GSR
is -.06, which indicates little or no relationship
between GSR and brightness. Means of the
brightness measures grouped according to product
and layout do not demonstrate any significant
differences. It is concluded that this
variable did not unduly influence the experimental
findings.


Summary and Conclusion 


The desirability of a method for selecting 
the most effective advertisement from a group 
of preliminary layouts is recognized by adver- 
tisers, and recently some use has been made 
of the galvanic skin response (GSR) for this 
purpose. A factorial experiment was designed 
using the GSR in an attempt to differentiate 
the reaction-producing characteristics of lay- 
outs. Twelve ads were used, comprising four 
layouts for each of three products. The GSR 
responses of sixty male subjects to these 
advertisements revealed significant differences 
among layout styles and among subjects, but 
no significant differences among the three 
products. All first-order interactions of prod- 
ucts, layouts, and subjects were significant. 

The results of the study demonstrate that 
the GSR is sensitive to differences in layouts 
and that further research may lead to the 
development of practical techniques for testing 
the effectiveness of advertising copy before 
the costly publication stage is reached. The 
study also serves to illustrate the importance 
of careful planning and design of GSR research, 
including the use of proper experimental con- 
trol and efficient statistical methods. 


Received August 1, 1950. 
Early publication. 


References 


1. Eckstrand, G., and Gilliland, A. R. The psycho- 
galvanometric method for measuring the effec- 
tiveness of advertising. J. appl. Psychol., 1948, 
32, 415-425. 

2. Edwards, Allen L. Experimental designs in psycho- 
logical research. New York: Rinehart and Co., 
1950.

3. Grant, D. A. A convenient alternating current 
circuit for measuring GSR’s. Amer. J. Psychol., 
1946, 59, 149-151. 


Book Reviews 


Anastasi, A., and Foley, J. P. Differential 
psychology (Rev. ed.). New York: Mac- 
millan, 1949. Pp. xv+894. $5.00. 


Anastasi’s 1937 Differential psychology is now 
enlarged by four new chapters covering the 
basic concepts of psychological testing, the 
biological and psychological factors in simple 
behavioral development, and the effects of 
schooling on intelligence. Other material, 
accrued since 1937, on trait organization and 
socio-economic differences has been added to 
the earlier discussion and is here reorganized 
into two substantially new chapters. Eighteen
other chapters, which conform closely to their
former headings and sub-headings, have been
re-written to include researches of more recent 
date. Outstanding inter-disciplinary contri- 
butions are certain studies from genetics, 
anthropology, and sociology. 

Sectional additions, scattered throughout the 
book, comprise: the measurement of special
aptitudes, conditions which affect the shape of
the distribution curve, the nature of heredity, 
the nature of environment, the heredity- 
environment relationship, popular misconcep- 
tions regarding heredity and environment, 
structural and functional characteristics, the 
concept of unlearned behavior, the study of 
practice as an approach to the heredity-en- 
vironment problem, typical findings on the 
improvement of mental test performance in 
age, the constancy of the I.Q., the study of
family pedigrees, the search for components of 
physique and temperament, constitutional 
type as a social stereotype, profile analysis, 
what intelligence tests measure, a cultural con- 
cept of intelligence, evaluation of group dif- 
ferences, the comparative achievement of dif- 
ferent races, and the cultural frame of reference 
in behavior, developmental stages, language,
and human nature. 

The book is divided to orient the reader 
historically and methodologically, to present 
an analysis of individual differences, and then 
to discuss group differences. Studies from 
sociology and anthropology are integrated 
with research in psychology to give a larger 
perspective on common problems. 


This second edition reflects a more studied 
consideration of content and aptness of ex- 
pression, and shows growth in unraveling 
theoretical complexities. The obscured opera- 
tion of heredity and environment in group 
differences is analyzed in a more mature fash- 
ion, attention being focused on newer insights 
from recent methodological refinements in the 
construction and application of psychological 
tests, from the constantly accruing longitudinal 
studies, and from the interest expressed in the 
fundamental nature of intelligence and per- 
sonality through analyses of their specific 
components.

The new edition affords a useful compendium 
for professional workers and students who can- 
not themselves read, digest, and coordinate 
original research publications. The main ser-
vice of this book, however, is in the training it
gives in being alert for “common pitfalls and
sources of error in interpretation of obtained
results,” with the authors repeatedly punc-
turing conclusions drawn from insufficient data,
unreliable tests, poorly designed research ex-
perimentation, ineffective controls, inadequate
statistical treatment, or just plain immaturity
of psychological understanding. This second
edition is well documented, well illustrated,
well arranged, and well written.


Gladys C. Schwesinger 
California Youth Authority 


Calhoon, Richard P. Problems in personnel
administration. New York: Harper &
Brothers, 1949. Pp. xii+540. $4.00.


The emphasis in this book is on “the under-
standing of personnel problems: the reasons
therefor, with methods of attack and solution
so that the feel of actual problems can be more
real to the student of personnel administration
whether he is in the field or in the classroom.”
With this objective in mind, Calhoon presents a
critical evaluation of present developments in
personnel policy and programs.

Through the twenty chapters in the book,
Calhoon treats the various kinds of problems
faced by personnel administrators, classified
in the usual way around selection, training,
wage and salary administration, etc. For each
phase of a personnel program, Calhoon attempts
to show its place in the total program,
to point out common errors or weaknesses of 
present practice, and to make suggestions for 
a sound program. Practical problems arising
in actual administration are stressed through- 
out. So, for example, in discussing “executive 
training,” Calhoon judges the practice of a 
one-to-two year rotating program for college 
graduates as a “dubious procedure at best,” 
points out difficulties such as maintaining moti- 
vation and getting the trainee accepted by line 
supervisors and employees, and then presents 
the case for having general training follow 
rather than precede a definite job assignment. 
Literally hundreds of personnel problems are 
discussed in this fashion. The book thus be- 
comes a field manual for personnel adminis- 
trators and a guide for those in training for 
personnel administration. 

As such, its value depends upon the validity 
of the analyses, criticisms, and suggestions 
made. Unfortunately, there is inadequate re- 
search behind many of the opinions and beliefs 
upon which personnel policies and practices 
are now decided. To supplement his own 
opinions and experience, however, Calhoon re- 
ports the results of a questionnaire survey of 
nearly 600 personnel administrators on nearly 
all of the major personnel problems. In addi- 
tion, the NICB surveys and AMA publications 
are frequently referred to. It can be assumed 
that, right or wrong, here is a distillation of the 
best opinion in the field, at least as viewed and 
reported by one person. 

To facilitate its use in personnel training 
situations, either in school or in industry,
Calhoon has ended each chapter with practical
discussion problems, projects, and role-playing
situations. These study aids are well chosen
and, if used properly, will vitalize the use of the
book as a text. Its greatest weakness, how-
ever, is the lack of reference material since
there is no attempt to survey the supporting
evidence in the literature and there are no
lists of suggested readings. Particularly neg-
lected are references to psychological literature;
only one psychological journal article and only
a few texts in industrial psychology are referred
to, and those but briefly.


Albert S. Thompson
Teachers College,
Columbia University


Stuit, Dewey B., Dickson, Gwendolen S., 
Jordan, Thomas F., and Schloerb, Lester. 
Predicting success in professional schools. 
Washington, D. C.: American Council on 
Education, 1949. Pp. v+187. $3.00.


This book summarizes present knowledge 
gained from experience and research in pre- 
dicting success in professional schools. It is 
edited under the sponsorship of the Committee 
on Student Personnel Work of the American 
Council on Education and is based on a series 
of technical bulletins prepared by the Veterans 
Administration for use by vocational advisers. 
Professional fields included are engineering, 
law, medicine, dentistry, music, agriculture, 
teaching, and nursing. 

The already comprehensive compilation of 
information in the Technical Bulletins has been 
extended in its usefulness by a very good 
editing job. A valuable introductory chapter 
has been added on problems and techniques of 
prediction. This contains such a good dis- 
cussion of the problems of prediction, and 
limitations of research, that it would merit 
reading and re-reading by all practitioners. 
Mention is made of the absence of predictive 
knowledge relative to such factors as interest, 
motivation, and personality, but this is not, 
in the writer's opinion, strongly enough em- 
phasized, since the burden of effort is first to 
establish the importance of what quantitative 
findings exist. There is, however, a good dis- 
cussion of the dearth of investigation relative 
to the influence of environmental factors in the 
prediction of professional success. The authors 
wisely caution continuously, not only in this 
chapter, but also in the concluding statement 
for each profession, against applying general-
izations for prediction to individual cases.

The general discussion of tests as predictive 
indices omits raising the question of whether 
tests measure what they purport to measure. 
It is also somewhat disappointing that although
there is emphasis on the restriction of the use 
of clinical instruments (i.e., personality tests) 
to professional workers with adequate back- 
ground in Clinical Psychology, there is not 
equal caution expressed regarding the inter-
pretation and clinical use of any testing in the
counseling process.

For each profession helpful general informa-
tion concerning national requirements precedes
the presentation of research findings. The 
way in which this material is organized for 
presentation is particularly good. For ex- 
ample, the effect of length of pre-training, 
native vs. transfer training, etc., are considered 
separately, followed by an evaluation of each 
kind of test and combination of indices. A 
summary and comment on implications for 
counseling completes each section. The book 
is in highly readable form, an excellent job 
having been done in sifting out and simplifying 
the important findings. There is a useful 
table of essential qualifications for each field 
of professional training, with a suggested 
battery of tests. These test recommendations 
are open to question, especially recommenda- 
tions given in the personality area. In addi- 
tion, no mention is made of the fact that, in 
generalizing regarding the lack of predictive 
value of interest inventories, such as the Strong, 
only a single key has been used, when it is 
generally considered that patterning or con- 
stellation is more diagnostically significant. 

Despite these few omissions, this book is an 
invaluable compilation for any high school or 
college counselor. It would seem also that in 
carefully selected cases it might profitably be 
made available to counselees who are consider- 
ing one or another of the professional fields 
covered. In any case, filling a major gap, it is 
an imperative addition to the working materials 
of the professional counselor.


(Mrs.) Barbara A. Kirk
Counseling Center,
University of California,
Berkeley, California


Schramm, Wilbur, Editor. Mass communica-
tions. Urbana: University of Illinois Press,
1949. Pp. xi+552. $4.50.


Mass communications is a collection of read-
ings. It claims to be nothing beyond that, and
the sub-title clearly states that it is a book of
readings. Consequently, there is no reason to
expect that prospective purchasers will be
misled into expecting anything else.

There are several indications that this col-
lection of readings fills a real need, rather than
just being another book. Experience with the
symposium, Communications in modern society,
of the Institute of Communications Research
of the University of Illinois suggested the need
for something that would be broader in scope. 
Furthermore, there were a number of more 
recent publications that deserved considera- 
tion; so the Director of the Institute selected 
and edited this collection of readings. Since 
all of the selections have been used in the Insti- 
tute, this publication rests on a background of 
actual experience. 

In a field as broad as this one, anyone who 
attempts to select and edit a limited number of 
readings obviously is confronted by several 
very difficult problems. His success in solving 
these problems largely determines the value of 
the publication.

Providing a satisfactory degree of continuity 
is one of these problems. Schramm's solution 
consists mainly of grouping the readings and 
giving a short introduction for each section. 
The section on the development of mass com-
munications includes readings on the history of
the newspaper, motion pictures, radio and in-
ternational communications. The control and
support of mass communications are considered
in a second section. The third section consists
of five readings on various aspects of the com-
munications process from the standpoints of
general semantics, psychology, and anthro-
pology. The three remaining sections cover
content, audiences, and effects. 

The appendix presents several pages of a
wide variety of facts and figures. 

Judged by conventional standards used to
evaluate most publications, this collection of
readings would have to be rated as spotty in
coverage and below par in continuity. How-
ever, it should not be judged by such standards.
The gaps are the result of the characteristics of
this field and are beyond the control of any
editor. The raw material for an integrated
treatment of the subject just does not exist.


Perhaps an even more serious problem is the
headache of deciding how much space to give
each paper. Complete reproduction of each
article and chapter selected would be possible
only by limiting coverage to a point at which
the objective would not be accomplished. The
longest reading is the summary of
Borden’s The economic effects of advertising,
which is 41 pages. Second in length is Llewel-
lyn White’s Ragtime to riches (28 pages), which
is one of the two readings by the same writer
on the development of American radio. Only 
two other readings are longer than 20 pages. 
Twenty-three readings run ten to twenty pages. 
Fifteen readings, not counting the section in- 
troductions, are shorter than ten pages. 

Inclusion of the shortest readings raises a 
question. For example, the shortest reading 
is one page of definitions of the devices of 
propaganda from The fine art of propaganda. 
Another reading consists of three tables show- 
ing changes in heroes, settings, and themes in 
large-magazine fiction (from Johns-Heine and 
Gerth). The summary of Kenneth Baker’s 
study of radio programming takes up only two 
pages, and so does the presentation of the 
Flesch readability formula. 

At first glance these short readings appear to 
contribute to the impression of inadequate 
continuity. In addition, such treatment of a 
subject might be misleading. However, in-
cluding them does increase the coverage of the
volume as a whole. Since the book probably 
will be used most often in class situations that 
permit further explanation and since individual 
users outside of classes probably will consult 
the original sources, the reviewer agrees with 
Schramm’s decision to include them. 

When complete reproduction is not practical, 
one is faced with a problem of what to omit. 
In several such cases, Schramm reports results 
and conclusions at the expense of description of 
the methods. Although this may tend to cause 
some concern among technicians, it is the only 
practical approach which is in line with his 
objectives. Obviously, this book was not in- 
tended for research technicians in a strict sense. 
Many of the readings are what tough-minded 
research people probably would consider 
“essays,” although some of them are relatively 
quantitative. This is as it should be. Any 
other approach would present a distorted pic- 
ture of the broad field of mass communications. 

The usefulness of this book depends on the 
need that any particular person might have 
for a collection of readings on this subject. It 
provides convenient material for class purposes, 
and it at least minimizes the amount of digging 
necessary to get an introductory overall picture 
of this field. There is ample evidence of both
good judgment and experience in the selection 
and editing. Any deficiencies result from the
nature of the available material and the weak-
nesses inherent in any publication which is
based upon the selection and editing of pre-
viously published manuscripts.


Alfred C. Welch
Knox Reeves Advertising, Inc.,
Minneapolis


Massing, Paul. Rehearsal for destruction. New 
York: Harper and Brothers, 1949. Pp. 341. 
$4.00. 


The book is well-written, carefully docu- 
mented, and scholarly in style, but is strictly 
a “who did what” (with dates) historical 
analysis. The absence of any systematic dis- 
cussion of the group or individual psychological 
determinants and factors in German anti- 
Semitism in the period covered (1871-1914) 
leaves little of interest except for the most 
zealous student of the subject. The broader 
purposes of the study may well have required 
this painstaking chronicling approach, but the 
result is a volume of only slight interest to most 
psychologists. 

Harrison G. Gough
University of California, 
Berkeley, California 


McCord, Carey P., and Witheridge, William
N. Odors: Physiology and control. New
York: McGraw-Hill, 1949. Pp. 405. $6.50.


In these days of ever-increasing specializa-
tion it may appear as temerity to attempt
covering in one volume most of what is known
about odors—from the anatomy of the olfac-
tory system and the physiology of olfaction to
the legal aspects of odor nuisances; from the
theoretical considerations of the mode of action
of the chemical stimuli to presentation of
practical procedures for odor cancellation and
counteraction; from classification of the odors
to the use of odors as warning agents in fuel
gases and as a method of broadcasting alarm
in mining emergencies; from discussion of the
rather unsuccessful efforts at clarification of
the relationship between chemical structure
and the quality of odors to the control of some
industrial odors by altering the chemistry of
the manufacturing processes; from odor of the
human body, in health and disease, to house-
hold odors and odors of water and foods. The
authors were concerned, primarily, with the
offensive odors and the engineering of odor 
control. The applied psychologist will read 
with some interest the chapters on industrial 
odors and the offensive trades. He might be 
consulted, on rare occasions, about making 
an odor survey; he will find in the present 
volume valuable practical suggestions. 

Because the area covered by the book is so 
large, many topics are treated rather sketchily. 
This appears to be particularly true of matters 
psychological. Thus, in connection with the 
question of pleasantness and unpleasantness of 
odors, mentioned only very briefly, the authors 
feel that "Such matters so invade the realm 
of emotions and habituations that further de- 
lineation must be delegated to the psycholo- 
gists” (p. 28). The short section on “mental 
perturbation" resulting from offensive odors 
(p. 71) is little more than a sample of “literary 
psychology." The experimental psychologist 
will note the new electrochemical theory of 
odor stimulation (p. 22), based on Linus 
Pauling's ideas about molecular architecture 
and biological reactions in general and odor 
sensations in particular, but he will be aware
of its highly speculative nature.

There is a large and moderately useful bibli-
ography (124 pp., some 3,500 references), ar-
ranged by chapters. The authors stated that
several hundred of the items were cited from
secondhand sources. It would have been use-
ful to identify this material by some simple
symbol and to separate it from references
actually seen.


Josef Brozek
University of Minnesota


Bellows, Roger M. and Rush, Carl H., Jr.
Workbook in personnel methods. Dubuque,
Iowa: Wm. C. Brown Co., 1949. Pp. 102.
$2.10.


This workbook is designed for use in an in-
troductory course in personnel methods to
accompany Bellows’ “Psychology of Personnel
in Business and Industry.” It attempts to
“enable the personnel methods trainee actually
to participate in personnel methods problems.”
It emphasizes the objective approach to prob-
lems and gives practice in the handling of
quantitative data and in quantitative ap-
proaches to human relations problems.


In line with this orientation, the authors 
have prepared exercises which require the 
analysis and interpretation of data such as 
might be obtained in an actual situation and 
which illustrate typical personnel techniques. 
So, for example, an exercise on the reliability 
of criteria deals with inter-rater agreement, 
one on recruitment requires computing aver- 
age test scores of applicants from three re- 
cruitment sources, one on selection involves an 
item analysis of application blank items, one 
on training requires plotting learning curves 
on two groups of industrial trainees, etc. 
Practice is given in the use of Taylor-Russell 
tables, an interview guide, an item discrimina- 
tion nomograph, norms, scattergrams, a merit 
rating form, a sociogram, and the Flesch readabil-
ity formula. Altogether 22 exercises related to
14 of the 20 chapters in the text are provided. 

In general, the exercises are well worked out; 
the instructions are clear, the data are well 
presented, the interpretation questions are 
thought-provoking. The common danger of 
a workbook’s becoming merely a study guide 
to the text has been avoided. It is not the 
"case-study" type of textbook supplement but 
one which is designed to give the student an 
exposure to some of the simpler tools used in 
the scientific approach to personnel problems. 


Albert S. Thompson 
Teachers College,
Columbia University


Pressey, Sidney L. Educational acceleration:
appraisals and basic problems. Columbus,
Ohio, Ohio State University: Bureau of
Educational Research, 1949, Bureau of Ed-
ucational Research Monographs, No. 31. Pp.
xiv + 154. $2.50, paper. $3.00, cloth.


This book, which should be of interest to all
who are concerned with policies in all levels of
educational institutions, presents the results
of a research project conducted at Ohio State
University. According to the author, “The
effort was made broadly to investigate a prob-
lem, experimentally to try various means of
dealing with it, and systematically to appraise
the results.” In this, Pressey has produced
a stimulating and valuable piece of work.

Until the last war resulted in pressure for
educational acceleration, discussion of it and
attempts to try it were largely characterized
by a priori thinking and little data. The
changes brought about by the war, however, 
provided an opportunity to study the effects 
of acceleration on large numbers of students. 
The author did this extensively and consid- 
ered the problems carefully. 

Pressey has shifted the emphasis to the 
question of what justification there might be 
for continuing to prolong adolescent status
and delay mature social functioning and pro- 
ductivity. He has found little or none. The 
work is organized around three basic concepts 
which lend clarity and direction to the studies:
maturity, prime (defined as peak biological 
potentiality), and individual differences. 

After chapters dealing with these basic con- 
cepts and with the opinions and studies of 
other educators and psychologists, the rest of 
the book seems to fall into three main sections: 
one part deals with the trend toward delayed 
completion of training; another presents the 
data bearing on the results of acceleration; 
the third is concerned with alternative methods
of acceleration. 

In the latter two sections, which are the 
more important, Pressey shows that, contrary 
to popular notions, accelerated students tend 
not only to be superior in academic achieve- 
ment (even when matched with non-acceler- 
ates with equal aptitude), but they take part 
in as many social activities, and, after gradua- 
tion, make equally good or better job and life 
adjustments. Included in this section is a 
study demonstrating the need for guidance in
selecting students for acceleration. 

The major alternative method of accelera- 
tion (in addition to summer attendance and 
heavier course loads) suggested is credit by 
examination. This is shown to have no 
adverse effects on later scholarship. In addi-
tion, [...] presented [...] under
undue strain. The conclusion drawn is that
[...] guided acceleration [...] educational
institutions will [...] their rules and mores
to avoid [...] accelerates feel out of place
or too pressed.

The most serious criticism of the book is in
the area of the presentation of data.
Conclusions are based on differences between
percentages of various groups meeting
certain criteria. Yet nowhere is there pre- 
sented a test of the significance of these 
differences. Pressey says, for example, “The 
differences are so small as to be of questionable 
significance, each taken by itself. Taken to- 
gether, they show a consistency . ..." While 
such consistency may argue for the acceptance 
of the results, still professional readers will 
probably have as hard a time as the reviewer 
in understanding why so eminently able a 
person as Pressey should have left this crucial 
step in the analysis undone or unreported. 
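
The omitted step can be illustrated with a minimal sketch of one common test for the difference between two independent percentages, the pooled two-proportion z-test. The function name and the counts in the usage example are hypothetical and are not taken from Pressey's monograph; the sketch also assumes groups large enough for the normal approximation to hold.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two independent proportions.

    Uses the pooled standard error under the null hypothesis that the two
    groups share one underlying proportion. Returns (z, two_sided_p_value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("pooled proportion of 0 or 1 gives no usable standard error")
    z = (p_a - p_b) / standard_error
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: students meeting some criterion in each group.
    z, p = two_proportion_z_test(60, 100, 40, 100)
    print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

A small percentage difference may fall short of significance taken singly, as Pressey grants; a test of this kind would show the reader how far short each one falls.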

Pressey concludes the book by reemphasiz- 
ing the need for guidance in any flexible 
program; he then makes eight specific recom- 
mendations which, he feels, might serve as 
guideposts to people willing to come to terms 
with the three basic concepts which he pre- 
sented at the beginning. 

Whatever the shortcomings of the analysis, 
Pressey has done a masterful piece of work in 
clearing the ground so that the real issues can 
be clearly seen; other investigators may well be 
able to explore the problems further and more 
adequately because of his work. 

John W. Gustad
Vanderbilt University


Weston, H. C. Sight, light and efficiency. 
London: H. K. Lewis and Co., Ltd., 1949. 
Pp. xiv + 308. 42s. 

This English treatise, based largely upon 
the results of recent research sponsored by the 
Medical Research Council and the Department 
of Scientific and Industrial Research in Great 
Britain, is concerned mainly with illumination 
in relation to industrial and occupational 
efficiency. After preliminary attention to the 
sense of sight and causes and symptoms of eye-
strain, consideration is given to occupational
demands upon sight, the facilitation of visual 
tasks, lighting and visual efficiency, incentive 
luminance and color, testing vision, and pro- 
tection of sight at work. Recommended levels 
of illumination are listed in the appendix. 
There is an abundance of appropriate figures
and illustrations.

There are many contributions in this book
which may be cited with approval. Among
these are the following: (1) The industrial
psychologist, the illuminating engineer, and
others interested in visual efficiency will find
material of practical value. (2) To a large
degree the cited experimental data provide a 
sound basis for the conclusions presented. 
(3) One should know what acuity is needed for 
comfortable work rather than acuity just 
sufficient to permit the job to be done. (4) 
The extent to which contrast and size are 
interdependent factors affecting the severity 
of the visual task is clearly explained. 
(5) It is stated (page 170) that “however high 
illumination is made, objects which are not 
very nearly alike either in apparent size or 
contrast, or in both these characteristics, never 
become equally easy to see." These findings 
should help to correct the views of those 
American writers who claim that the effects 
of small size and poor contrast can be com- 
pletely compensated for by increases in 
illumination. (6) Methods of illuminating 
work objects are effectively described. (7) 
The psychological feeling of well-being that 
comes with adequate illumination combined
with desirable combinations of color and
brightness of painting receives a desirable
emphasis. (8) For the most part, the schedule
of recommended values of illumination is
adequate (not excessive) in terms of exper-
imental findings.

Careful evaluation will raise questions on
certain specific details. A few of these crit-
icisms follow: (1) During recent years it has
often seemed practical to compute the illum-
ination level needed for a specific visual task.
A job analysis reveals size of detail and
brightness contrast. Then by reference to
laboratory studies of visual acuity in relation
to size of detail, contrast and illumination,
the level of illumination for the job is computed.
Weston has worked out nomograms to facil-
itate such determinations. Although this
procedure is helpful at the present stage of our
knowledge, it is a dubious practice. It is
questionable whether it is valid to make a
direct transfer from visual acuity data to a
relatively complex seeing situation where inte-
grated visual patterns are involved. Further-
more, recommendations which rest on estimates
from threshold measures have not in most
cases been validated for the actual visual
work activities. When this has been done
the computed illumination values have been
found to be greater than needed. (2) When
the author cites the need of 500 foot candles 
or more for lace-making because workers prefer 
to work out in the sun rather than in poorly 
illuminated interiors (page 153), his inference 
is unwarranted. There is no other choice, 
such as a well lighted interior, available. (3)
It is possible that other factors than differ- 
ences in illumination (page 162) affect the 
shape of the work curves in winter as compared 
with summer. (4) The tentative inference 
(page 142) that white print on black back- 
ground is more readable than black on white 
is not supported by experimental findings.

Industrial psychologists and illumination 
engineers should welcome this well organized 
and well written book on the relation between 
illumination and visual efficiency. Careful 
reading should reveal how extravagant some 
of the illumination intensities recommended 
in this country are. 

Miles A. Tinker
University of Minnesota


Ward, Roswell. Out-of-school vocational guid-
ance. New York: Harper and Bros., 1949.
Pp. xiv + 155. $2.50.

Admittedly, the literature of vocational 
guidance dealing specifically with the “out-of-
school” population is limited. The author
draws upon extended experience as a voca- 
tional counselor and as an organizer of out-of- 
school vocational guidance services in many 
states, in an attempt to fill the gap. 

“Vocational guidance” is held to be a
fact-finding, coordinating, planning and re-
ferral service. The “vocational counselor,”
who is the chief figure in this service, does
more than counseling (the “planning” part of
vocational guidance). His effectiveness is
determined most by his skill as a fact-finder
and his ability to coordinate and make use of
information and resources.

The narrowing of the traditional definition
of “vocational guidance” shows its value both
in the organization and in the operation of
vocational guidance service. Many practical
ideas are presented for carrying on an orga-
nized community service to meet the continuing
need for vocational guidance.

The author does not hesitate to present
dogmatically certain ideas which his
own experience has convinced him are valuable.
Among these, the following are typical: 

The identification of the “non-selective”
worker permits economies through the use of
less intensive forms of counseling. 

Adherence to a classification of “vocational 
problems" according to symptoms is consist- 
ent with the defined function of the vocational 
counselor as a maker of plans for vocational 
adjustment and as an “action-centered” rather 
than “applicant-centered” fact-finder and re- 
ferral agent. The categories of this classifica-
tion (each preceded by the word “Vocational”)
are: Immaturity, Confusion, Insecurity, Mis- 
direction, Fixation, Conflict. 

A sharp distinction between "interest" and 
“motivation” has implications for the fact- 
finding activities of vocational counseling. 
An eighteen-item classification of vocational 
motivations is presented. 

An insistence that vocational counseling 
must be tied basically to vocational informa- 
tion and vocational economics rather than to 
psychology, has implications for the selection 
and training of vocational counselors. Preju- 
dice is revealed here which is carried to the 
point that the training of vocational counselors 
would be taken out of schools of education 
and departments of psychology. The author 
likens the contrast between vocational guid- 
ance as an adjunct of education and as an 
out-of-school service, to the difference between 
fire prevention and fire fighting.

A criticism of “mirror counseling” is a 
further revelation of the same prejudice. 
Consistent with the fear of “mirror counseling”
is a distrust of a psychologist’s ability to
be a vocational counselor. The psychologist
would be only a resource person [made available to]
the “applicant” (not “client”) by referral
from the counselor.

One of the limitations (other than the fairly
obvious effect [...]) of the volume is
the heavy dependence on the “applicant,”
either in the interview or by filled-in form,
for the data about him essential to his
vocational planning. The usual categories of
techniques for securing and evaluating these data
are left largely to the judgment, attained
through experience, of the vocational
counselor.




It is the opinion of the reviewer that when
the author's prejudices are fully discounted, 
the book stands as a practical challenge to 
many traditional practices. Also, the book 
should be thought-provoking, rather than 
offensive, to psychologists and educators 
concerned with carrying on counseling services 
and with the preparation of counselors. 

Fred M. Fowler 


Department of Public Instruction, 
The State of Utah 


Allport, G.W. The individual and his religion. 
New York: The Macmillan Company, 1950. 
Pp. x + 147. $2.50. 

Almost independently of its merits, this 
book is important in three respects. First, 
and I think of greatest significance, is the fact 
that an American psychologist of first rank 
and theoretical interests has addressed himself 
to this topic. As Allport points out, contem- 
porary psychologists avoid the subject of 
religion about as much as James and his 
contemporaries avoided any detailed treatment 
of sex. (Sexuality, by the way, gets strikingly 
little attention in Allport’s discussion of the 
organic “origins of the religious quest,” pages 
9-12.) The state of mind of modern Western 
man, including many of the intellectuals, is 
such that even the most atheistic psychologist, 
if interested in personality, ought to concern 
himself with the religious impulses in somewhat 
more detail than to dismiss them airily as 
“sublimations,” "failure of nerve," and the 
like. That a writer of Allport’s stature has 
done so may stimulate others, who do not 
share his scientific views about personality, 
to do the same. Secondly, this book will be 
useful to counselors as bibliotherapy for clients 
experiencing religious conflict, especially in the 
case of students whose conflicts arise via 
contact with psychology and allied sciences.
Thirdly, and here I part company with
Allport, the book is a challenge to the determined
and wholehearted secularist in psychology,
because it is quite obviously not a disinterested
psychological causal analysis but
includes a not-too-subtle bit of religious 
propaganda, the author's avowal (p. vii) to 
the contrary notwithstanding. I do not know 
the facts about Allport's own religious convictions,
if any; but unless I am mistaken he has a
religious urge two sigmas above the American
psychologists' mean. Let me cite a few random
examples having this flavor. “The
universe is simply incomprehensible. Frag- 
ments of it may be fairly well understood, 
but certainly not the design of the whole" 
(p. 18). “The fact . . . is that scientific thought 
is known . . . to be able to cover only part of 
the ground. It goes a long way but not far 
enough" (p. 20). “. . . even in this age of 
technology and social disintegration when 
skepticism hangs heavily upon the horizon. . .” 
(p. 39). 

Certain purely “psychological” theses are 
presented without the qualifications one would 
like to see in a book that will get into the 
hands of non-psychologists. On page 63, we 
find that “. . . in only slight degree, if at all, 
is this energy drawn from the reservoir of or- 
ganic drives"—treating Allport's important and 
respectable theory of autonomy as an estab- 
lished principle of psychological science. We 
are assured (p. 65) that if a strong religious
sentiment fades away, “. . . we can be certain
that religion was never a central feature of the
personality” (Allport’s form of Calvin’s doctrine
of perseverance, perhaps?).

The book is marred by a few minor mistakes,
such as Allport’s canonizing of Thomas a
Kempis (p. 124), and the lugging in of that
dreadful old chestnut, so often refuted by
careful logical analysis, about determinism’s
incompatibility with praise, exhortation and
effort (p. 115). The presentation of the
positivist critique under “referential doubting”
(pp. 117-121), especially the end of the paragraph
at the top of page 119, is incredibly
superficial. Positivists do not “condemn”
poetry, for it does not claim to assert facts.
“Legal discourse” is primarily a metalinguistic
examination of “what entails what,” and
is, so far as I know, admitted by most positivists.
The lumping together of “poetic, artistic,
legal” discourse as meaningless to positivists
shows a carelessness which in a less
respected writer would almost suggest a lack
of intellectual integrity.

Finally, the book fails to satisfy because of
a deep-seated methodological mistake. It
assumes that a really well-rounded psychological
discussion of religion can be given without
any concern with the truth-character of
religious beliefs. By this device, Allport
tries to have his cake and eat it too; he grinds 
the religious axe and at the same time avoids 
the necessity of defending any particular 
religious thesis. But certain of his “psycholog- 
ical” assertions can simply not be made unless 
the prior question of validity has been settled. 
For example, he several times rejects the 
common characterization of religion as “wish- 
fulfilling.” But he is at great pains to exhibit 
its role in satisfying needs. Now, if the 
objects of (propositional!) religious belief do 
not exist, or if the evidence for their existence 
is slight, those who have such beliefs, and have 
them on the psychological bases Allport 
describes, are surely engaging in wish-fulfilling, 
whether Allport likes the phrase or not! The 
constant use of evaluative terms and remarks 
shows that Allport is not carrying out a purely
causal analysis, but he never comes to grips,
except in a very sketchy and dilettante manner,
with the cognitive issues about religion which
are involved. The book suffers here from the
subjectivism, that making man the measure
of all things (I am sure Allport would deny
this!) which Bertrand Russell has described as
the “characteristic madness of our times.”
For this reason, the philosophically sophisticated
clinician could use the book therapeutically
only by permitting himself an essentially
cynical view of the religiously conflicted
student: “it will help him, because he won't
see through it.”


Paul E. Meehl

University of Minnesota

Tinkelman, Sherman. Difficulty prediction of
test items. New York: Bureau of Publications,
Teachers College, Columbia University,
1947. Pp. 55. $1.85.

For a civil service examination
consisting of 100 5-option multiple-choice
items covering arithmetical reasoning,
judgment, vocabulary, logical inference, and
[related] topics, thirty judges estimated the
difficulty of each item. The estimated difficulty
(expected per cent of patrolman candidates
to answer correctly) was correlated
with the actual per cent of patrolman candidates
passing. The judges “seemed to be
a professionally competent group of examiners”
with from three to thirty years of
experience with the examining agency. The 
validity of the judges’ estimates of difficulty 
(correlation between estimated and actual 
per cent correct) ranged from .23 to .77 with a 
median of .53. 

The validity of the pooled judgment of the 
thirty judges was .76. Nine graduate stu- 
dents in an advanced test construction course, 
(presumably without three to thirty years 
experience) repeated the estimates for an 
entirely different test. For this trained, but 
inexperienced group (on a different test, to 
be sure), the validities ranged from .17 to .72 
and the validity of the pooled judgment 
(9 judges) was .81, higher than the value of 
.76 obtained from 30 experienced personnel 
examiners. 
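
Why a pooled judgment should outrun the typical single judge can be illustrated with a small simulation. The sketch below does not use Tinkelman's data or procedure. It assumes hypothetical judges whose estimates equal the actual per cent passing plus an error shared by all judges on a given item plus an independent error of their own; the standard deviations (15 and 28 points) are arbitrary choices made only so that individual validities fall in the neighborhood of .5.

```python
import random
from statistics import mean, median


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_judges(n_items=100, n_judges=30, shared_sd=15.0, own_sd=28.0, seed=0):
    """Hypothetical judges: estimate = actual + shared item error + own error."""
    rng = random.Random(seed)
    actual = [rng.uniform(20, 90) for _ in range(n_items)]      # per cent passing
    shared = [rng.gauss(0, shared_sd) for _ in range(n_items)]  # error common to all judges
    estimates = [
        [a + s + rng.gauss(0, own_sd) for a, s in zip(actual, shared)]
        for _ in range(n_judges)
    ]
    return actual, estimates


actual, estimates = simulate_judges()

# Validity of each judge taken singly.
individual = [pearson(est, actual) for est in estimates]
median_validity = median(individual)

# Pooled judgment: the mean of the judges' estimates for each item.
pooled = [mean(item_estimates) for item_estimates in zip(*estimates)]
pooled_validity = pearson(pooled, actual)

print(f"median individual validity: {median_validity:.2f}")
print(f"validity of pooled judgment: {pooled_validity:.2f}")
```

Averaging cancels each judge's independent error but not the error the judges share, which is one reason a pooled coefficient can stay well short of unity however many judges are added.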

Analysis of variance techniques were used 
to identify factors affecting the predictability 
of item difficulty. Predictability was unre- 
lated to item content—difficulty of vocabulary 
items was no more accurately predicted than 
was that of arithmetical reasoning and the 
differences among content categories could 
have arisen by chance. Judges were relatively 
consistent in their ability (or inability) to 
estimate or guess the difficulty of the items as 
measured by their performances on odd and 
even items. 

Even within the limits of “validity” repre- 
sented by a median coefficient of .53, there 
were large constant errors of over- or under- 
estimation. For only nine of the thirty judges 
were these constant errors statistically insig- 
nificant. Since the constant errors were in 
both directions, the pooled judgment was 
substantially accurate. 
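
The cancelling of constant errors in a pooled judgment is simple arithmetic, and a toy case makes it concrete. The figures below are invented for illustration and are not taken from the study: one hypothetical judge runs 12 points high on every item, one runs 12 points low, and one has no constant error.

```python
from statistics import mean

# Actual per cent passing for five hypothetical items.
actual = [30.0, 45.0, 60.0, 75.0, 90.0]

# Three hypothetical judges (illustrative values only).
judges = {
    "overestimator": [a + 12.0 for a in actual],
    "underestimator": [a - 12.0 for a in actual],
    "unbiased": list(actual),
}


def constant_error(estimates, actual):
    """Mean signed difference between estimated and actual per cent passing."""
    return mean(e - a for e, a in zip(estimates, actual))


errors = {name: constant_error(est, actual) for name, est in judges.items()}

# Pooled judgment: mean estimate across judges for each item.
pooled = [mean(item_estimates) for item_estimates in zip(*judges.values())]
pooled_error = constant_error(pooled, actual)

for name, err in errors.items():
    print(f"{name}: constant error {err:+.1f}")
print(f"pooled judgment: constant error {pooled_error:+.1f}")
```

The cancellation is exact here only because the two biases were chosen to be equal and opposite; with real judges it would be approximate.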

The judges tended to overestimate the
percentage of passes on difficult items and to
underestimate it for easy items. The judgments
thus exhibited the usual regression
toward the mean. There were no significant
differences in predictability between “judg- 
ment” and "information" questions or between 
items with high and those with low item-test 
coefficients (erroneously termed validity coef- 
ficients). 
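
Regression toward the mean of this kind shows up as a slope of less than one when estimated difficulty is plotted against actual difficulty. The sketch below uses invented figures, not the study's: estimates are assumed to sit halfway between the actual per cent passing and a central value of 55.

```python
from statistics import mean

# Actual per cent passing for four hypothetical items, hard to easy.
actual = [20.0, 40.0, 60.0, 80.0]

# Assumed pattern (illustrative only): estimates are pulled halfway toward 55.
estimated = [55.0 + 0.5 * (a - 55.0) for a in actual]


def least_squares_slope(x, y):
    """Slope of the least-squares line predicting y from x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


slope = least_squares_slope(actual, estimated)

for a, e in zip(actual, estimated):
    direction = "over" if e > a else "under"
    print(f"actual {a:.0f}%  estimated {e:.1f}%  ({direction}estimated)")
print(f"slope of estimated on actual: {slope:.2f}")
```

A slope of 1 would mean the estimates spread as widely as the actual values; any slope between 0 and 1 reproduces the pattern of overestimating passes on hard items and underestimating them on easy ones.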

The problem is introduced in a civil service 
setting—and a rather conservatively conceived 
one. The necessity for basing the selection 
of items on subjectively judged estimates of 
difficulty is mentioned on page 1, but nowhere 
is the need for estimates of validity, subjective
or statistical, alluded to. Experimental
tryout of items is ruled out
because of the need for secrecy, although
progressive merit systems have found ways
of tryout without violating security.

The tone of the conclusions and implications 
seems unwarrantedly optimistic. “The re- 
sults of this study indicate that judges are 
probably able to estimate relative item dif- 
ficulty accurately." This conclusion does not 
seem warranted for a median validity correla- 
tion of .53. Moreover, although the "best" 
three judges have a combined validity of .85, 
the author fails to note that this is a “back- 
validated" coefficient, comparable to that 
computed for a test on the item analysis 
sample after the best items have been selected 
by item analysis. 
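
The inflation in a back-validated coefficient can be shown by simulation. The sketch below is an illustration under stated assumptions, not a reanalysis of the study: thirty hypothetical judges are all given exactly the same true accuracy, the three who happen to do best on one set of items are pooled, and the pooled validity is computed both on the items used to pick them and on a fresh set.

```python
import random
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def replicate(rng, n_judges=30, n_items=40, keep=3):
    """Select the `keep` best judges on one item set; re-check them on another."""

    def item_set():
        actual = [rng.gauss(0, 1) for _ in range(n_items)]
        # Assumption: every judge is equally accurate (same signal weight, same noise),
        # so whoever ranks "best" on a given item set does so by luck.
        judged = [[0.6 * a + rng.gauss(0, 1) for a in actual] for _ in range(n_judges)]
        return actual, judged

    actual_a, judged_a = item_set()  # items used to select the judges
    actual_b, judged_b = item_set()  # fresh items judged by the same panel

    ranked = sorted(
        range(n_judges),
        key=lambda j: pearson(judged_a[j], actual_a),
        reverse=True,
    )
    best = ranked[:keep]

    def pooled(judged):
        return [mean(judged[j][i] for j in best) for i in range(n_items)]

    return pearson(pooled(judged_a), actual_a), pearson(pooled(judged_b), actual_b)


rng = random.Random(1950)
results = [replicate(rng) for _ in range(200)]
back_validated = mean(r[0] for r in results)   # same items used for selection
cross_validated = mean(r[1] for r in results)  # items not used for selection

print(f"mean validity on selection items: {back_validated:.2f}")
print(f"mean validity on fresh items:     {cross_validated:.2f}")
```

Because the judges here differ only by chance, the whole gap between the two averages is capitalization on chance, the same effect that inflates a test's validity when it is computed on the sample used to select its items.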

Although Tinkelman's book does shed light
on the process of judging difficulty, it will not
convince the careful examiner that estimated
difficulty is an adequate substitute for empirical
data.


Charles I. Mosier

Adjutant General's Office,
Washington, D. C.


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor,
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota


Projective psychology. Lawrence E. Abt and Leopold
Bellak, Editors. New York: Alfred A. Knopf, Inc.,
1950. Pp. 485. $6.00.

Converting a veterans guidance center. George D. Barahal.
California: Stanford University Press, 1950.
Pp. 99. $1.50.

Beginning experimental psychology. S. Howard Bartley.
New York: McGraw-Hill Book Co., Inc., 1950.
Pp. 483. $4.00.

Problem-solving processes of college students. Benjamin
S. Bloom and Lois J. Broder. Chicago: University
of Chicago Press, 1950. Pp. 109. $2.75.

Readings in modern methods of counseling. A. H. Brayfield,
Editor. New York: Appleton-Century-Crofts,
Inc., 1950. Pp. 526. $5.00.

Sex questions and answers. Fred Brown and Rudolf T.
Kempton. New York: McGraw-Hill Book Co., Inc.,
1950. Pp. 256. $2.95.

Scientific research: its administration and organization.
George P. Bush and Lowell H. Hattery, Editors.
Washington, D. C.: American University Press, 1950.
Pp. 190. $3.25.

Personality, a systematic, theoretical and factual study.
Raymond B. Cattell. New York: McGraw-Hill
Book Co., Inc., 1950. Pp. 689. $5.50.

The application of measurement to health and physical
education. Second edition. H. Harrison Clarke.
New York: Prentice-Hall, Inc., 1950. Pp. 455.

Recent experiments in psychology. Second edition.
Leland W. Crafts, Theodore C. Schneirla, Elsa E.
Robinson, and Ralph W. Gilbert. New York:
McGraw-Hill Book Co., Inc., 1950. Pp. 503. $3.50.

Some theory of sampling. William E. Deming. New
York: John Wiley and Sons, Inc., 1950. Pp. 602.
$9.00.

Psychology. Second edition. Dockeray and Lane.
New York: Prentice-Hall, Inc., 1950. Pp. 576.
$3.75.

America begins. Richard M. Dorson, Editor. New
York: Pantheon Books, Inc., 1950. Pp. 438. $4.50.

The counseling interview. Clifford E. Erickson. New
York: Prentice-Hall, Inc., 1950. Pp. 174. $1.75.

Communicating ideas to the public. Stephen E. Fitzgerald.
New York: Funk and Wagnalls Co., 1950.
Pp. 292. $3.50.

The principles of scientific research. Paul Freedman.
Washington, D. C.: Public Affairs Press, 1950. Pp.
222. $3.25.

The psychology of dictatorship. G. M. Gilbert. New
York: Ronald Press Co., 1950. Pp. 327. $4.00.

Outside readings in psychology. E. L. Hartley, H. G.
Birch, and R. E. Hartley, Editors. New York:
Thomas Y. Crowell Co., 1950. Pp. 880.

Clinical applications of suggestion and hypnosis. William
T. Heron. Springfield: Charles C Thomas, Publisher,
1950. Pp. 125. $3.00.

Textbook of abnormal psychology. Revised edition.
Carney Landis and Marjorie M. Bolles. New York:
The Macmillan Co., 1950. Pp. 634. $5.00.

Psychiatry for social workers. Second edition. Lawson
G. Lowrey. New York: Columbia University Press,
1950. Pp. 385. $4.50.

In my mind’s eye. Frederick Marion. New York:
E. P. Dutton and Co., Inc., 1950. Pp. 315. $3.75.

Carbon dioxide therapy, a neurophysiological treatment
of nervous disorders. L. J. Meduna. Springfield:
Charles C Thomas, Publisher, 1950. $5.00.

Experiments in social process. James Grier Miller,
Editor. New York: McGraw-Hill Book Co., Inc.,
1950. Pp. 201. $3.00.

Physiological psychology. Second edition. Clifford T.
Morgan and Eliot Stellar. New York: McGraw-Hill
Book Co., Inc., 1950. Pp. 572. $5.00.

Psychology and art of the blind. G. Revesz. New York:
Longmans, Green and Co., Inc., 1950. Pp. 338.

The development of a needs and problems inventory for
high-school youth. Benjamin Shimberg. Lafayette,
Ind.: Studies in Higher Education LXXII,
Division of Educational Reference, Purdue University,
1950. Pp. 78.

Delinquency and human nature. D. H. Stott. Dunfermline,
Scotland: Carnegie United Kingdom Trust,
1950. Pp. 460. $1.00.

Improving management communication. General management
series No. 145. New York: American Management
Association, 1950. Pp. 26. $.75.

Practical operating problems in personnel administration.
Personnel series No. 129. New York: American
Management Association, 1949. Pp. 28.

Industrial applications of medicine and psychiatry. Personnel
series No. 130. New York: American Management
Association, 1949. Pp. 31. $.75.

German aviation medicine, World War II. Vols. 1 and 2.
Prepared under auspices of the Surgeon General.
Washington, D. C.: Superintendent of
Documents, 1950. Pp. 1302. $8.50.

Wages, employment and personnel problems in a changing
[...]. Production series No. 187. New York:
American Management Association, 1949. $1.25.


